Jobcontrol¶
Jobcontrol allows tasks to run asynchronously, and supports launching a task on one machine while it executes on another.
For example, you may launch a task from a laptop (running Maestro) to a compute node, so that the task runs on several cores. Jobcontrol takes care of transferring input files from your laptop to the cluster and collecting results and log files once the job is complete.
How to launch a job¶
Launching a job means running a command with -HOST <host entry argument>. A host entry is currently defined in schrodinger.hosts files.
Example:
$SCHRODINGER/ligprep -imae in.mae -omae out.mae
Without a -HOST argument, as above, the job runs on localhost. Adding -HOST bolt_cpu submits the job to bolt.
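The difference between a local and a remote launch is only the trailing -HOST argument. A minimal sketch of assembling such a command line follows; build_launch_command is a hypothetical helper written for illustration and is not part of the Schrodinger API.

```python
import os


def build_launch_command(program, args, host=None):
    """Return the argument list for launching `program` under $SCHRODINGER.

    `host` is the name of a host entry. When it is None, the -HOST flag is
    omitted, which means the job runs on localhost.
    """
    # Fall back to the literal variable name when SCHRODINGER is not set,
    # so the sketch can be inspected without an installation.
    schrodinger = os.environ.get("SCHRODINGER", "$SCHRODINGER")
    command = [schrodinger + "/" + program] + list(args)
    if host is not None:
        command += ["-HOST", host]
    return command


ligprep_args = ["-imae", "in.mae", "-omae", "out.mae"]
local_cmd = build_launch_command("ligprep", ligprep_args)
remote_cmd = build_launch_command("ligprep", ligprep_args, host="bolt_cpu")
print(" ".join(remote_cmd))
```

The resulting list could be passed to subprocess.run; the host name bolt_cpu only works if a matching entry exists in a schrodinger.hosts file.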
Job Model¶
From the command-line perspective, launching a job runs a short script that submits the job and returns with the output: JobId: <jobid>
If the command returns with a zero exit status and a JobId, the job was started successfully. This should take seconds for a small job, or as long as it takes to negotiate the start with the remote host. After that, the job runs in the background.
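A calling script can apply the success rule above by checking the exit status and looking for the JobId line. The sketch below assumes the line has exactly the form "JobId: <jobid>"; extract_job_id is a hypothetical helper, and the job id in the usage line is made up for illustration.

```python
import re

# Assumption: the launcher prints a line of the form "JobId: <jobid>".
JOBID_PATTERN = re.compile(r"^JobId:\s*(\S+)\s*$", re.MULTILINE)


def extract_job_id(returncode, stdout):
    """Return the job id if the launch succeeded, otherwise None.

    A launch counts as successful only when the exit status is zero
    and the output contains a JobId line.
    """
    if returncode != 0:
        return None
    match = JOBID_PATTERN.search(stdout)
    return match.group(1) if match else None


# Made-up launcher output, used only to exercise the helper.
sample_output = "JobId: example-0-12345678\n"
job_id = extract_job_id(0, sample_output)
```

In practice the two arguments would come from the returncode and stdout attributes of a subprocess.run result.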
Running code under jobcontrol¶
Python scripts that run locally can be adapted to run remotely. jobcontrol will use launchapi if the script defines a function get_job_spec_from_args at the top level. $SCHRODINGER/run will use the information returned from that function when a -HOST option is used. For example:
$SCHRODINGER/run script.py -HOST localhost
will execute the main function under jobcontrol on the localhost by using the information returned from get_job_spec_from_args.
For the full set of options, see the in-code documentation of launchapi.
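The convention above depends only on a function with a fixed name existing at the top level of the script module. The following sketch shows one way such a check could be written; it is an illustration of the convention, not necessarily how $SCHRODINGER/run performs it, and defines_job_spec_hook is a hypothetical name.

```python
import types


def defines_job_spec_hook(module):
    """Return True if `module` has a top-level callable named
    get_job_spec_from_args."""
    hook = getattr(module, "get_job_spec_from_args", None)
    return callable(hook)


# Two stand-in modules built in memory, so no script files are needed.
plain = types.ModuleType("plain_script")
adapted = types.ModuleType("adapted_script")
exec("def get_job_spec_from_args(argv):\n    return argv", adapted.__dict__)

print(defines_job_spec_hook(plain))    # False
print(defines_job_spec_hook(adapted))  # True
```

A function nested inside another function or a class would not be found this way, which is why the hook has to be defined at the top level.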
Ordinary script¶
For a script that executes normally (myscript.py), you only need to make sure that your script is importable as a module. In this example, myscript simply prints the hostname of the machine it is running on, so that the output differs from machine to machine.
import socket

def main():
    print(socket.gethostname())

if __name__ == "__main__":
    main()
$SCHRODINGER/run myscript.py
will print out your local hostname.
Add jobcontrol API¶
If we want to execute our script under jobcontrol, locally or remotely, we need to add a function at the top level that jobcontrol can use as a job specification. This function must be called get_job_spec_from_args. Here, we're registering stderr and stdout so that we can see the output of the script:
import socket
from schrodinger.job import launchapi

def get_job_spec_from_args(argv):
    """
    Return a JobSpecification necessary to run this script on a remote
    machine (e.g. under job control with the launch.py script).

    :type argv: list(str)
    :param argv: The list of command line arguments, including the script name
        at [0], matching $SCHRODINGER/run __file__ sys.argv
    """
    job_builder = launchapi.JobSpecificationArgsBuilder(argv)
    job_builder.setStderr("myscript.log")
    job_builder.setStdout("myscript.log")
    return job_builder.getJobSpec()

def main():
    print(socket.gethostname())

if __name__ == "__main__":
    main()
Assuming that myscript.py is in the distribution on your local and remote computers:
$SCHRODINGER/run myscript.py
will print out your local hostname.
$SCHRODINGER/run myscript.py -HOST bolt_cpu
will log the hostname of the bolt compute node to myscript.log.
Register input and output files¶
Files that are transferred from the launch machine to the compute machine need to be registered with job control. In this example, we have an input Maestro file and an output Maestro file.
import os
import sys
from schrodinger import structure
from schrodinger.job import launchapi

def get_job_spec_from_args(argv):
    job_builder = launchapi.JobSpecificationArgsBuilder(argv)
    mae_file = argv[1]
    output_mae_file = os.path.basename(mae_file) + "processed.mae"
    job_builder.setInputFile(mae_file)
    job_builder.setOutputFile(output_mae_file)
    job_builder.setStderr("myscript.log")
    job_builder.setStdout("myscript.log")
    return job_builder.getJobSpec()

def main():
    output_file = os.path.basename(sys.argv[1]) + "processed.mae"
    with structure.StructureReader(sys.argv[1]) as reader:
        with structure.StructureWriter(output_file) as writer:
            for ct in reader:
                ct.title = ct.title + "processed"
                writer.append(ct)

if __name__ == "__main__":
    main()
Execute using: $SCHRODINGER/run myscript.py foo.mae -HOST localhost
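In the example above, get_job_spec_from_args and main each compute the output file name, and the two must agree so that the registered output file is the one actually written. The stdlib-only sketch below shows what that expression produces: the input extension is kept in the name. The second function is a hypothetical alternative for illustration, not something the original example uses.

```python
import os


def output_name_as_written(input_path):
    # Same expression as in the example above: the input extension is kept.
    return os.path.basename(input_path) + "processed.mae"


def output_name_without_extension(input_path):
    # Hypothetical alternative: drop the extension and add a separator.
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return stem + "_processed.mae"


print(output_name_as_written("data/foo.mae"))         # foo.maeprocessed.mae
print(output_name_without_extension("data/foo.mae"))  # foo_processed.mae
```

Whichever naming scheme is chosen, keeping it in a single shared function avoids the two call sites drifting apart.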
Using a jobname¶
Some jobs use the concept of a jobname, which is specified through the command line or Maestro and determines the names of the log files for the job.
import socket
from schrodinger.job import launchapi

def get_job_spec_from_args(argv):
    job_builder = launchapi.JobSpecificationArgsBuilder(argv, use_jobname_log=True)
    return job_builder.getJobSpec()

def main():
    print(socket.gethostname())

if __name__ == "__main__":
    main()
Execute using: $SCHRODINGER/run myscript.py -JOBNAME foo -HOST localhost
Maestro Incorporation¶
A single Maestro file from a job can be marked for incorporation into Maestro, meaning that its structures will show up in the project table.
def get_job_spec_from_args(argv):
    job_builder = launchapi.JobSpecificationArgsBuilder(argv)
    job_builder.setOutputFile("foo.mae", incorporate=True)
    return job_builder.getJobSpec()
Using $SCHRODINGER/run -FROM <product>¶
Some scripts require $SCHRODINGER/run -FROM <product> to run. In this case, we mark this when we create a JobSpecification:
def get_job_spec_from_args(argv):
    job_builder = launchapi.JobSpecificationArgsBuilder(argv, schrodinger_product="scisol")
    return job_builder.getJobSpec()
Integration into af2¶
af2 is the framework that Schrodinger uses to write GUIs. Implement getJobSpec() in the panel to create a job spec. We assume we want to execute the myscript.py that we wrote above:
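def getJobSpec(self):
    # `driver` is assumed to be the imported script module,
    # e.g. `import myscript as driver` at the top of the panel file.
    driver_path = 'myscript.py'
    cmd = [driver_path, self.input_selector.structFile()]
    return driver.get_job_spec_from_args(cmd)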
Integration with an Argument Parser¶
An argument parser is useful when we want to document, validate, and access command line arguments within a script. It is easy to integrate an argument parser into a script that uses jobcontrol.
import argparse
import os
import sys
from schrodinger import structure
from schrodinger.job import launchapi
from schrodinger.utils import cmdline

def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("inputfile", help="maestro file input")
    args = parser.parse_args(argv)
    return args

def get_job_spec_from_args(argv):
    # first argument is this script
    args_namespace = parse_args(argv[1:])
    job_builder = launchapi.JobSpecificationArgsBuilder(argv, use_jobname_log=True)
    job_builder.setInputFile(args_namespace.inputfile)
    jobname = os.path.splitext(os.path.basename(args_namespace.inputfile))[0]
    job_builder.setJobname(jobname)
    return job_builder.getJobSpec()

def main(*argv):
    args = parse_args(argv)
    with structure.StructureReader(args.inputfile) as reader:
        for ct in reader:
            print("ct title={}".format(ct.title))

if __name__ == '__main__':
    cmdline.main_wrapper(main, *sys.argv[1:])
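The argument handling in the script above can be checked without a Schrodinger installation. The sketch below repeats only the stdlib parts: get_job_spec_from_args receives the script name at argv[0], so it passes argv[1:] to the parser, and the jobname is the input file name without its directory and extension. jobname_from_argv is a hypothetical helper that isolates that logic.

```python
import argparse
import os


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("inputfile", help="maestro file input")
    return parser.parse_args(argv)


def jobname_from_argv(argv):
    # argv[0] is the script name, so only argv[1:] goes to the parser.
    args = parse_args(argv[1:])
    # Strip the directory and the extension from the input file name.
    return os.path.splitext(os.path.basename(args.inputfile))[0]


jobname = jobname_from_argv(["myscript.py", "structures/foo.mae"])
print(jobname)  # foo
```

Sharing one parse_args function between get_job_spec_from_args and main keeps the job specification and the script itself in agreement about which arguments are valid.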
For the full set of options, see the in-code documentation.