schrodinger.job.jobcontrol module

Core job control for Python.

There are currently four major sections of this module: “Job database,” “Job launching,” “Job backend,” and “Job hosts.” The job database section deals with getting information about existing jobs, the job launching section deals with starting up a subjob, the job backend section provides utilities for a Python script running as a job, and the job hosts section deals with the host entries read from the schrodinger.hosts file.

The philosophy of this module is to reinvent as little as possible (i.e. use the mmjob C libs where we can) and keep things fairly simple. If we need more complicated features of job control, we can hopefully add them as we go along.

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.job.jobcontrol.timestamp(msg)
exception schrodinger.job.jobcontrol.JobcontrolException

Bases: Exception

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.job.jobcontrol.JobLaunchFailure

Bases: schrodinger.job.jobcontrol.JobcontrolException, RuntimeError

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.job.jobcontrol.MissingFrontendException

Bases: schrodinger.job.jobcontrol.JobcontrolException

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.job.jobcontrol.MissingHostsFileException

Bases: schrodinger.job.jobcontrol.JobcontrolException

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception schrodinger.job.jobcontrol.UnreadableHostsFileException

Bases: schrodinger.job.jobcontrol.JobcontrolException

__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class schrodinger.job.jobcontrol.Job(job_id, username='', file=None, launch_output=None, launch_error=None, handle=None, manage_handle=True)

Bases: object

A class to access a specific record in the job database.

A Job instance is always a snapshot of the job record at a specific point in time. It is only updated when the readAgain method is explicitly invoked.

Standard scalar attributes that can be accessed from Job objects include:

  • JobID
  • Name
  • Program
  • Processors
  • Host
  • User
  • Dir
  • HostsFile
  • HostEntry
  • JobHost
  • JobUser
  • JobDir
  • Status
  • ExitStatus
  • ExitCode
  • Command
  • Env
  • Home
  • LaunchTime
  • StartTime

Standard list attributes that can be accessed from Job objects include:

  • InputFiles
  • OutputFiles
  • LogFiles
  • MonitorFiles
  • SubJobs
  • Envs

Not all of these attributes are always present. Their presence can be checked with hasattr(), and a full list of the available attributes for a specific Job object can be retrieved with the keys() method.
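Because attributes may be absent, code that reads a job record usually guards each lookup. The following is a minimal sketch that relies only on the documented keys() and get() methods; the helper name describe_job and the default choice of fields are illustrative and not part of the module.

```python
def describe_job(job, fields=("Name", "Program", "Status", "ExitStatus")):
    """Return {field: value} for the requested fields present in the record.

    `job` is expected to behave like a jobcontrol.Job: keys() lists the
    attributes found in the last read and get(name, default) returns a value.
    """
    present = set(job.keys())
    # Skip fields that are missing instead of reporting them as None.
    return {name: job.get(name) for name in fields if name in present}
```

Fields that are missing from the record are simply left out of the result, so callers can tell "absent" apart from a stored empty value.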

WARNING: To instantiate a Job object, the Job Database in which it is stored must be accessible from the current process. For example, when a Windows workflow is submitted to a Linux machine, any children that it spawns cannot access their parent’s job record, because the home directory isn’t shared.

__init__(job_id, username='', file=None, launch_output=None, launch_error=None, handle=None, manage_handle=True)

Initialize a read-only Job object.

There are three ways that a Job object can be created, and the optional arguments reflect this. The first way is with only a job_id, e.g. captured from stdout of a launch. This Job object will rely on the database to find information. If created from a launch, stderr and stdout of the launch command can be attached.

The second way is with a file specification. This will create a Job object that is independent of the standard job database.

A final way is to provide an existing mmjob handle.

With the file or mmjob handle initialization methods any job_id value will be overwritten with the value retrieved from the mmjob handle. You can simply provide None as a placeholder value.

Parameters

username (str)
The username is not required and should not generally be specified. It is provided mainly for backward compatibility.
launch_output (str)
The captured stdout from the launch process that created the job.
launch_error (str)
The captured stderr from the launch process that created the job.
manage_handle (bool)
Obsolete.
readAgain()

Reread the database. Calling this routine is necessary to get fresh values.

keys()

Return a list of keys present in the last read of the job control record.

get(name, default=None)

Return the last read value of a job control key. Returns default if the job does not have the key ‘name’.

writeFile(filename=None)

Writes the job record to a file.

If no filename is given, the filename is the JobId.

isComplete()

Returns True if the job is complete.

This method uses native mmjob logic to determine whether the job is complete.

isQueued()

Returns True if the job is a batch queue job or grid job.

succeeded()

Returns False if the job was killed, died or fizzled. Returns True if ExitStatus is finished.

Raises an exception if the job isn’t completed, so use isComplete() before calling.
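The ordering constraint above (refresh the snapshot, check isComplete(), only then call succeeded()) can be sketched as a small helper. The function name job_outcome and its three return strings are hypothetical; only readAgain(), isComplete() and succeeded() come from the Job class.

```python
def job_outcome(job):
    """Classify a job as 'running', 'succeeded' or 'failed'.

    readAgain() is called first because a Job object is a snapshot, and
    isComplete() is checked before succeeded(), which raises for a job
    that has not completed.
    """
    job.readAgain()
    if not job.isComplete():
        return "running"
    return "succeeded" if job.succeeded() else "failed"
```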

setStatusIncorporated()

Set the status of the job to “incorporated” if the job has completed.

sendMessage(msg)

Send an arbitrary text message to the job’s jmonitor process.

This should not normally be necessary.

kill()

Kill the job if it is running.

wait(max_interval=60)

Wait for the job to complete, sleeping up to max_interval seconds between each database check. (The interval increases gradually from 2 seconds up to the maximum.)

NOTE: Do not use if your program is running in Maestro, as this will make Maestro unresponsive while the job is running.
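For callers that want control over how the waiting is done, the polling pattern can be written by hand. This is a sketch, not the implementation of wait(): the doubling growth is an assumption (the description above only says the interval increases gradually from 2 seconds), and backoff_intervals and wait_for are hypothetical names.

```python
import time


def backoff_intervals(max_interval=60, start=2):
    """Yield sleep intervals that grow from `start` up to `max_interval`.

    Doubling is an assumption; the module only documents a gradual increase.
    """
    interval = start
    while True:
        yield min(interval, max_interval)
        interval = min(interval * 2, max_interval)


def wait_for(job, max_interval=60, sleep=time.sleep):
    """Poll `job` until isComplete() is true, re-reading the record each time."""
    for interval in backoff_intervals(max_interval):
        job.readAgain()  # a Job is a snapshot, so refresh before checking
        if job.isComplete():
            return job
        sleep(interval)
```

Passing a different sleep callable makes it possible to substitute a wait that does not block the calling thread.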

summary()

Return a string summarizing all current Job attributes.

getDuration()

Returns the wallclock time of the job if it is complete. StopTime - StartTime. Units are seconds (float). If job is not complete, returns None.

Status

Get the Status of the job.

ExitStatus

Get the ExitStatus of the job.

getApplicationHeaderFields(default=None)
Returns:An OrderedDict of essential jobcontrol keyword:value pairs used to standardize application log files.
Parameters:default (any) – Value assigned to a keyword if the corresponding attribute is not defined.

Keywords include: ‘JobId’, ‘Name’, ‘Program’, ‘MMshareExec’, ‘Host’, ‘Dir’, ‘HostEntry’, ‘Queue’, ‘JobHost’, ‘JobDir’, ‘JobMMshareExec’, ‘Commandline’, and ‘StartTime’.

getApplicationHeaderString(field_sep=' : ')
Returns:A string of essential jobcontrol parameters, in a preferred order, with simple formatting.
Parameters:field_sep (str) – String that delimits the keyword and value.
Note:‘Queue’ only appears in the header string if it is defined in the job record.

Example:

jobbe = schrodinger.job.jobcontrol.get_backend()
if jobbe:
    print(jobbe.getJob().getApplicationHeaderString())
getInputFiles()

Get list of InputFiles and Transfers marked for input

InputFiles

Get list of InputFiles and Transfers marked for input

getOutputFiles()

Get list of OutputFiles and Transfers marked for output

OutputFiles

Get list of OutputFiles and Transfers marked for output

getProgressAsPercentage()

Get the value of backend job progress in terms of percentage. Return 0.0 when a job is not yet in running state.

getProgressAsSteps()

Get the value of backend job progress in terms of steps and totalsteps. Return (0,1) when a job is not yet in ‘running’ state.

getProgressAsString()

Get the value of backend job progress in terms of descriptive text. Return “The job has not yet started.” when a job is not yet in running state.
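The three progress getters can be combined into a single status line. A minimal sketch, assuming getProgressAsSteps() returns a (steps, totalsteps) pair as the (0,1) default suggests; the helper name progress_line and the output format are illustrative.

```python
def progress_line(job):
    """Format one status line from the three documented progress getters."""
    percent = job.getProgressAsPercentage()
    step, total = job.getProgressAsSteps()
    text = job.getProgressAsString()
    return f"{percent:5.1f}% ({step}/{total}) {text}"
```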

purgeRecord()

Purge the job record for the job from the database.

schrodinger.job.jobcontrol.read(file_)

Create and return a Job object from a jobcontrol file.

schrodinger.job.jobcontrol.get_active_jobs()

Returns list of jobs that are not completed.

Returns:list of Job objects
schrodinger.job.jobcontrol.get_jobs_by_program(program_name)

Find jobs with a specific Program attribute.

Parameters:program_name (str) – program name
Return type:list(Job)
Returns:The list of jobs in the jobdb that are associated with the given program. Each item of the list is a Job object.
schrodinger.job.jobcontrol.launch_job(cmd, print_output=False, expandvars=True, launch_dir=None, timeout=None)

Run a process under job control and return a Job object. For a process to be under job control, it must print a valid JobId: line to stdout. If such a line isn’t printed, a RuntimeError will be raised.

The cmd argument should be a list of command arguments (including the executable) as expected by the subprocess module.

If the executable is present in $SCHRODINGER or $SCHRODINGER/utilities, an absolute path does not need to be specified.

NOTE: UI events will be processed while jlaunch is executing.

Parameters:
  • print_output (bool) – Determines if the output from jlaunch is printed to the terminal or not. Output will be logged (to stderr by default) if Python or JobControl debugging is turned on or if there is a launch failure, even if ‘print_output’ is False.
  • expandvars (bool) – If True, any environment variables of the form $var or ${var} will be expanded with their values by the os.path.expandvars function.
  • launch_dir – Launch the job from the specified directory. Will CD into this directory before launching the job, and CD out of it before running the event loop while jlaunch is running.
  • timeout (float or None) – Timeout (in seconds) to be applied while waiting for the job control launch process to start or finish. This allows launch_job() to return sooner if the job is unable to launch. If None, the process will run without a timeout.
Raises:
  • RuntimeError – If there is a problem launching the job (e.g., no JobId gets printed). If running within Maestro, an error dialog will first be shown to the user.
  • OSError – If launch_dir doesn’t exist.
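The two documented failure modes can be handled separately by the caller. The sketch below wraps the launch in a helper named launch_or_none (a hypothetical name); the launch function is passed in as an argument so the helper can be exercised without a Schrodinger installation, and in real use it would be schrodinger.job.jobcontrol.launch_job.

```python
import logging

logger = logging.getLogger(__name__)


def launch_or_none(launch_job, cmd, launch_dir=None, timeout=None):
    """Launch `cmd` and return the Job, or None if the launch fails.

    `launch_job` is expected to be schrodinger.job.jobcontrol.launch_job.
    """
    try:
        return launch_job(cmd, launch_dir=launch_dir, timeout=timeout)
    except OSError as err:
        # Documented for a launch_dir that does not exist.
        logger.error("Cannot launch from %r: %s", launch_dir, err)
    except RuntimeError as err:
        # Documented for launch problems such as a missing JobId line.
        logger.error("Launch of %s failed: %s", cmd, err)
    return None
```

Since JobLaunchFailure derives from RuntimeError, the second handler would also catch it.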
schrodinger.job.jobcontrol.prepend_schrodinger_run(cmd)

Check if a command executes a Python script and prepend $SCHRODINGER/run to the command if it does not already begin with it.

Parameters:cmd (list(str)) – Command to prepend $SCHRODINGER/run to.
schrodinger.job.jobcontrol.fix_cmd(cmd, expandvars=True)

A function to clean up the command passed to launch_job.

Parameters:
  • cmd (list of strings) – A command in a form that can be passed to subprocess.Popen.
  • expandvars (bool) – If True, any environment variables of the form $var or ${var} will be expanded with their values by the os.path.expandvars function.
Returns:

The command to be launched
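The expandvars behaviour can be illustrated with the standard library alone. This sketch shows only the variable-expansion step and does not reproduce whatever other cleanup fix_cmd performs; expand_cmd is a hypothetical name.

```python
import os


def expand_cmd(cmd, expandvars=True):
    """Return a copy of `cmd` with $var and ${var} references expanded."""
    if not expandvars:
        return list(cmd)
    # os.path.expandvars leaves references to unset variables unchanged.
    return [os.path.expandvars(arg) for arg in cmd]
```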

schrodinger.job.jobcontrol.list2jmonitorcmdline(cmdlist)

Turn a command in list form to a single string that can be executed by jmonitor.

schrodinger.job.jobcontrol.launch_from_job_spec(job_spec, launch_parameters, display_commandline=None)

Launch a job based on its specification.

Parameters:
Returns:

A schrodinger.job.jobcontrol.Job object.

schrodinger.job.jobcontrol.get_backend()

A convenience function to see if we’re running under job control. If so, return a _Backend object. Otherwise, return None.

schrodinger.job.jobcontrol.get_runtime_path(pathname)

Return the runtime path for the input file ‘pathname’.

If the pathname is of a type that job control will not copy to the job directory or no runtime file can be found, returns the original path name.

schrodinger.job.jobcontrol.under_job_control()

Returns True if this process is running under job control; False otherwise.

class schrodinger.job.jobcontrol.Host(name)

Bases: object

A class to encapsulate host info from the schrodinger.hosts file.

Use the module level functions get_host or get_hosts to create Host instances.

Variables:
  • name – Label for the Host.
  • user – Username by which to run jobs.
  • processors – Number of processors for the host/cluster.
  • tmpdir – Temporary/scratch directory to use for jobs. List.
  • schrodinger – $SCHRODINGER installation to use for jobs.
  • env – Variables to set in the job environment. List.
  • gpgpu – GPGPU entries. List.
  • queue – Queue entries only. Queue type (e.g., SGE, PBS).
  • qargs – Queue entries only. Optional arguments passed to the queue submission command.
__init__(name)

Create a named Host object. The various host attributes must be set after object instantiation.

Only host-entry fields can be public attributes of a Host object. Attributes introduced to capture other information about the entry must be private (named with a leading underscore).

to_hostentry()

Return a string representation of the Host object suitable for including in a hosts file.

getHost()

Return the name of the host, which defaults to ‘name’ if a separate ‘host’ attribute wasn’t specified.

setHost(host)

Store host as _host to allow us to use a property for the ‘host’ attr.

host

Return the name of the host, which defaults to ‘name’ if a separate ‘host’ attribute wasn’t specified.

isQueue()

Check to see whether the host represents a batch queue. Returns True if the host is a traditional queue or a grid host.
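A common use of isQueue() is to separate queue entries from directly addressed hosts. A minimal sketch, assuming a list of Host objects such as the one returned by get_hosts(); split_by_queue is a hypothetical helper name.

```python
def split_by_queue(hosts):
    """Partition Host-like objects into (queue_hosts, plain_hosts) by isQueue()."""
    queue_hosts, plain_hosts = [], []
    for host in hosts:
        (queue_hosts if host.isQueue() else plain_hosts).append(host)
    return queue_hosts, plain_hosts
```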

schrodinger.job.jobcontrol.get_hostfile()

Return the name of the schrodinger.hosts file last used by get_hosts(). The file is found using the standard search path ($SCHRODINGER_HOSTS, local dir, $HOME/.schrodinger, $SCHRODINGER).

schrodinger.job.jobcontrol.hostfile_is_empty(host_filepath)

Return whether the hosts file at host_filepath is empty, meaning it contains only the localhost entry. If host_filepath is an empty string or an invalid path, this function raises IOError.

Parameters:host_filepath (str) – schrodinger.hosts file to use.
Returns:bool
schrodinger.job.jobcontrol.get_installed_hostfiles(root_dir='')

Return the pathname for the schrodinger.hosts file installed in the most recent previous installation directory we can find.

If a root pathname is passed in, previous installations are searched for there. Otherwise, we look in the standard install locations.

schrodinger.job.jobcontrol.get_hosts()

Return a list of all Hosts in the schrodinger.hosts file. After this is called, get_hostfile() will return the pathname for the schrodinger.hosts file that was used. Raises UnreadableHostsFileException or MissingHostsFileException on error.

schrodinger.job.jobcontrol.hostfile_is_valid(fname)
Parameters:fname (str) – The full path of the host file to validate
Returns:a (bool, str) tuple indicating whether the host file is valid
Return type:tuple
schrodinger.job.jobcontrol.to_hostfile(hosts)

Return a string representation for the given list of Host objects that would be suitable for use as a hosts file.

schrodinger.job.jobcontrol.sorted_hosts(hosts)

Return a list of the given Host objects re-ordered to satisfy the conditions that, 1. the localhost entry (if any) is first, and 2. any entry used as a base entry is defined before it’s used. Otherwise, the input order is preserved.

schrodinger.job.jobcontrol.get_host(name)

Return a Host object for the named host. If the host is not found, we return a Host object with the provided name and details that match localhost. This matches behavior that jobcontrol uses. Raises UnreadableHostsFileException or MissingHostsFileException on error.

schrodinger.job.jobcontrol.host_str_to_list(hosts_str)

Convert a hosts string (Ex: “galina:1 monica:4”) to a list of tuples. First value of each tuple is the host, second value is # of cpus.
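The conversion can be illustrated with a standalone parser. This is a sketch of the format, not the module's implementation, and it makes two assumptions: entries may be separated by whitespace or commas, and an entry without a ":N" suffix gets None as its CPU count. The name parse_hosts is hypothetical.

```python
def parse_hosts(hosts_str):
    """Parse 'galina:1 monica:4' into [('galina', 1), ('monica', 4)].

    Assumptions: entries are separated by whitespace or commas, and an
    entry without ':N' gets None as its CPU count.
    """
    entries = []
    for token in hosts_str.replace(",", " ").split():
        name, _, count = token.partition(":")
        entries.append((name, int(count) if count else None))
    return entries
```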

schrodinger.job.jobcontrol.host_list_to_str(host_list)

Converts a hosts list (Ex: [ (‘host1’,1), (‘host2’, 10) ] ) to a string. Output example: “host1:1,host2:10”
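The output format can be reproduced in a few lines. This is a sketch rather than the module's implementation; writing a count of None as a bare host name is an assumption, and format_hosts is a hypothetical name.

```python
def format_hosts(host_list):
    """Format [('host1', 1), ('host2', 10)] as 'host1:1,host2:10'.

    Assumption: a count of None is written as the bare host name.
    """
    return ",".join(
        name if count is None else f"{name}:{count}" for name, count in host_list
    )
```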

schrodinger.job.jobcontrol.get_command_line_host_list()

Return a list of (host, ncpu) tuples corresponding to the host list that is specified on the command line.

This function is meant to be called by scripts that are running under a toplevel job control script but are not running under jlaunch.

The host list is determined from the following sources:
  1. SCHRODINGER_NODELIST
  2. JOBHOST (if only a single host is specified)
  3. “localhost” (if no host is specified)

If no SCHRODINGER_NODELIST is present in the environment, None is returned.

schrodinger.job.jobcontrol.get_backend_host_list()

Return a list of (host, ncpu) tuples corresponding to the host list as determined from the SCHRODINGER_NODEFILE.

This function is meant to be called from scripts that are running under jlaunch (i.e. backend scripts).

Returns None if SCHRODINGER_NODEFILE is not present in the environment.

schrodinger.job.jobcontrol.calculate_njobs(host_list=None)

Derive the number of jobs from the specified host list. This function is useful for determining the number of subjobs if the user didn’t specify the ‘-NJOBS’ option.

Parameters:host_list (String or List of tuples) – Either a string of hosts with optional numbers of CPUs, or a list of tuples, each consisting of a host and an optional number of CPUs.

If host list is not specified then it uses get_command_line_host_list() to determine njobs, else uses the user provided host list.

Please note: the host_list can be passed as a string or as a list of tuples.
For example: for -GLIDE_HOST “host1 host2:4” host_list would become
[(‘host1’, None), (‘host2’, 4)]
schrodinger.job.jobcontrol.is_valid_hostip(hostip)

Checks if it is a valid ip address. Logic comes from mmjob.cpp

Parameters:hostip (str) – host IP address
schrodinger.job.jobcontrol.is_valid_hostname(hostname)

Checks if the hostname is valid.

Parameters:hostname (str) – host name
schrodinger.job.jobcontrol.get_jobname(filename=None)

Figure out the jobname from the first available source: 1) the SCHRODINGER_JOBNAME environment variable (comes from -JOBNAME during startup); 2) the job control backend; 3) the basename of a given filename.

Parameters:filename (str) – if provided, and the jobname can’t otherwise be determined, (e.g., running outside job control with no -FILENAME argument), construct a jobname from its basename.
Returns:jobname (may be None if filename was not provided)
Return type:str
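
The precedence described for get_jobname can be sketched as a standalone function. This is not the module's implementation: the backend jobname is passed in as a plain argument, the environment is injectable for testing, and dropping the file extension when building a name from the filename is an assumption. The name resolve_jobname is hypothetical.

```python
import os


def resolve_jobname(filename=None, backend_jobname=None, environ=None):
    """Pick a jobname from the first available source, or return None."""
    environ = os.environ if environ is None else environ
    # 1) SCHRODINGER_JOBNAME, which comes from -JOBNAME during startup.
    name = environ.get("SCHRODINGER_JOBNAME")
    if name:
        return name
    # 2) The jobname known to the job control backend, if any.
    if backend_jobname:
        return backend_jobname
    # 3) The basename of the given file.
    if filename:
        # Assumption: the extension is dropped when building the name.
        return os.path.splitext(os.path.basename(filename))[0]
    return None
```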