Module pipeline
Classes for running pipelines.
The main class is called Pipeline. This class parses the input file,
creates appropriate stages, and runs them in their own
subdirectories.
The StageJob class represents a pipeline job linked to a specific
stage.
The IO (in/out object) classes (defined in pipeio.py) represent
information that is passed from one stage to another, such as a list
of files. They are also called Variables.
Input Object Syntax
The Pipeline input file is used to specify which stages to run, how
to run them (parameters), what to use for input, and where to send the
output. An example input file looks like:
SET MY_INPUT
VARCLASS Structures
FILE /home/adzhigir/vsw_testing/20confs.mae
The SET line value (MY_INPUT) specifies the name of the IO object. The
VARCLASS value (Structures) specifies the PipeIO class to create.
Pipeline uses VARCLASS to determine which variable to create. Pipeline
searches the schrodinger.pipeline.pipeio module for the class name
specified on this line. If it is not found there, Pipeline assumes
that a custom class is specified as an absolute path. (In this case,
make sure the custom module is in your PYTHONPATH.)
All lines following VARCLASS define what information to put into this
variable; in this case it is a Maestro file (20confs.mae).
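The two-step VARCLASS lookup described above can be sketched roughly as follows. This is an illustration only, not Pipeline's actual code: resolve_class is a hypothetical helper, "absolute path" is assumed to mean a dotted <module>.<class name> path, and standard-library modules stand in for schrodinger.pipeline.pipeio so that the sketch runs anywhere.

```python
import importlib


def resolve_class(spec, default_module):
    """Return the class named by spec (hypothetical helper).

    Look for spec in default_module first. If it is not there, treat
    spec as a dotted '<module>.<class name>' path and import it.
    """
    default = importlib.import_module(default_module)
    if hasattr(default, spec):
        return getattr(default, spec)
    module_name, _, class_name = spec.rpartition(".")
    if not module_name:
        raise ImportError("Cannot resolve class %r" % spec)
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError("No %r in module %r" % (class_name, module_name))


# 'collections' plays the role of the default module in this sketch.
found_in_default = resolve_class("OrderedDict", "collections")
# Not found in the default module, so it is imported by its dotted path.
found_by_path = resolve_class("json.JSONDecoder", "collections")
```

The second call only succeeds because the json module is importable, which mirrors the PYTHONPATH requirement for custom classes.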
Stage Syntax
An example stage file looks like:
STAGE MY_STAGE
STAGECLASS macromodel.ConfSearchStage
INPUT MY_INPUT
OUTPUT MY_OUTPUT
FFLD MMFFS
The STAGE line value (MY_STAGE) specifies the name of the stage. The
STAGECLASS keyword specifies the <module>.<class name> that defines
the stage.
Pipeline uses STAGECLASS to determine which stage to create. Pipeline
also searches the schrodinger.pipeline.stages namespace. Please make
sure the module is in your PYTHONPATH.
See schrodinger.pipeline.stages.combine for an example of how to write
a stage module.
Input variables for the stage are specified via INPUT keywords, and
outputs via OUTPUT keywords. The rest of the keywords tell the stage
how to run.
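To make the layout of the input file concrete, here is a minimal sketch of how the SET and STAGE blocks from the two examples could be split into names and keyword/value pairs. It is not the parser Pipeline uses: parse_blocks and its return structure are hypothetical, and the real grammar (comments, quoting, multi-value keywords) may differ.

```python
def parse_blocks(text):
    """Split input-file text into blocks started by SET or STAGE lines.

    Hypothetical helper: each block is a dict with its kind, its name,
    and the keyword/value pairs of the lines that follow it.
    """
    blocks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword in ("SET", "STAGE"):
            blocks.append({"kind": keyword, "name": value, "keywords": []})
        elif blocks:
            # Pairs are kept in a list because a keyword may repeat.
            blocks[-1]["keywords"].append((keyword, value))
        else:
            raise ValueError("Keyword before first SET/STAGE line: %r" % line)
    return blocks


example = """
SET MY_INPUT
VARCLASS Structures
FILE /home/adzhigir/vsw_testing/20confs.mae

STAGE MY_STAGE
STAGECLASS macromodel.ConfSearchStage
INPUT MY_INPUT
OUTPUT MY_OUTPUT
FFLD MMFFS
"""

blocks = parse_blocks(example)
for block in blocks:
    print(block["kind"], block["name"], block["keywords"])
```

In this sketch the MY_STAGE block refers to MY_INPUT by name through its INPUT keyword, which is how a stage is connected to an IO object defined by a SET block.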
If you wish to run the Pipeline without using the pipeline startup
machinery:
p = pipeline.Pipeline([options])
p.readFile(<input file>)
try:
    p.run()
except RuntimeError:
    ...
If restartability is important, specify the restart_file when
constructing the Pipeline object.
To restart Pipeline, do:
p = pipeline.Restart(restart_file [, new options])
try:
    p.run()
except RuntimeError:
    ...
where restart_file is the same file that you specified to the Pipeline
constructor when the initial instance was created.
Copyright Schrodinger, LLC. All rights reserved.
StageJob
A "Job" that is used by Pipeline to run a Stage.
Pipeline
A controller responsible for running the stages in the correct order.
log(text)
Prints the specified text to the log pipe, adding a newline at the end.
logn(text)
Prints the specified text to the log pipe with no newline.
_host_is_queue(hostname)
Return True if hostname is a queue.
_get_job(jobid)
Return the job object for the job with the given jobid.
importName(modulename, name)
Import a named object from a module in the context of this function.
Restart(restart_file, restartbeg=False)
Recover a saved Pipeline instance.
__doc__ = ...
DEBUG = False
WAITING = 'WAITING'
RUNNING = 'RUNNING'
COMPLETED = 'COMPLETED'
FAILED = 'FAILED'
RESTARTING = 'RESTARTING'
updated_from_dump = {}
old_stage_classes = {'chargefilter.ChargeFilterStage': 'filter...
global_logfh = sys.stdout
_job_objects = {}
__package__ = 'schrodinger.pipeline'
logn(text)
Print the specified text to the log pipe with no newline. This is
especially useful when printing progress periods.
add_host_lists(list1, list2)
Append hosts in list2 to list1. As the example shows, the counts of
hosts that appear in both lists are added together.
Example:
list1 = a:5,b:10
list2 = a:2,c:12
output = a:7,b:10,c:12
The order of hosts is retained (first list is given priority).
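The merge in the example above can be reproduced with a short sketch. The representation is an assumption: hosts are modelled here as a list of (hostname, count) pairs, and add_host_lists_sketch is a hypothetical stand-in that returns a new list rather than modifying list1, which may differ from what add_host_lists itself does.

```python
def add_host_lists_sketch(list1, list2):
    """Merge two lists of (hostname, count) pairs, summing the counts.

    Hosts keep the order of first appearance, so entries from list1
    come before any host that occurs only in list2.
    """
    totals = {}
    order = []
    for host, count in list(list1) + list(list2):
        if host not in totals:
            totals[host] = 0
            order.append(host)
        totals[host] += count
    return [(host, totals[host]) for host in order]


# The a:5,b:10 plus a:2,c:12 example from the text above.
merged = add_host_lists_sketch([("a", 5), ("b", 10)], [("a", 2), ("c", 12)])
print(merged)
```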
subtract_host_lists(list1, dict2)
Return available (not used) hosts. This function subtracts the host
dict dict2 from the host dict list1.
- Parameters:
list1 (dict) - All available hosts (specified by user), with hostname
as key and CPU count as value.
dict2 (dict) - All used hosts (used by stages).
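A minimal sketch of the subtraction, assuming both arguments are plain dicts that map a hostname to a CPU count. subtract_host_lists_sketch is a hypothetical stand-in; in particular, dropping hosts that have no CPUs left is a choice made for this sketch, and the real function may keep them with a count of zero.

```python
def subtract_host_lists_sketch(all_hosts, used_hosts):
    """Return the hosts from all_hosts that still have free CPUs.

    Both arguments map hostname to CPU count. Hosts whose CPUs are all
    in use are left out of the result (an assumption of this sketch).
    """
    available = {}
    for host, cpus in all_hosts.items():
        remaining = cpus - used_hosts.get(host, 0)
        if remaining > 0:
            available[host] = remaining
    return available


free = subtract_host_lists_sketch(
    {"a": 7, "b": 10, "c": 12},  # all hosts specified by the user
    {"a": 7, "b": 4},            # CPUs currently used by stages
)
print(free)
```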
importName(modulename, name)
Import a named object from a module in the context of this function.
For example, if you would like to create an instance of the Foo class
from the bar.py module:
foo_class = importName("bar", "Foo")
foo_instance = foo_class()
- Raises:
ImportError - Raised when the object cannot be imported.
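The behaviour documented for importName can be approximated with the standard importlib module. This is a sketch of the idea rather than the function's actual implementation: import_name_sketch is a hypothetical name, and the standard-library pathlib module is used in place of bar.py so that the example runs as written.

```python
import importlib


def import_name_sketch(modulename, name):
    """Import modulename and return the object called name from it.

    Raises ImportError if the module cannot be imported or if it has
    no attribute with the requested name.
    """
    module = importlib.import_module(modulename)
    try:
        return getattr(module, name)
    except AttributeError:
        raise ImportError(
            "Module %r has no object %r" % (modulename, name))


# Same pattern as the Foo/bar example above, with a real module.
path_class = import_name_sketch("pathlib", "PurePosixPath")
path_instance = path_class("/tmp/example")
```

Converting the missing-attribute case to ImportError keeps a single exception type for callers to handle, matching the Raises entry above.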
Restart(restart_file, restartbeg=False)
Recover a saved Pipeline instance.
Specify new options only if the settings need to change.
Returns a Pipeline instance recovered from the restart_file. You need
to call run() on the returned instance in order to get the pipeline
running.
- Parameters:
restartbeg - Whether to start failed stages from the beginning.
- Raises:
RuntimeError - Raised if a Pipeline can't be loaded from the specified
file.
old_stage_classes
- Value:
{'chargefilter.ChargeFilterStage': 'filtering.ChargeFilterStage',
 'gencodes.GenCodesStage': 'gencodes.Recombine',
 'ligfilter.LigFilterStage': 'filtering.LigFilterStage',
 'merge.MergeStage': 'glide.MergeStage',
 'mmgbsa.MMGBSAStage': 'prime.MMGBSAStage',
 'mopac.MopacStage': 'semiemp.SemiEmpStage',
 'phase.DBConfSites': 'phase.DBConfSitesStage',
 'phase.DBManage': 'phase.DBManageStage'}