Module pipeline

Classes for running pipelines.

The main class is called Pipeline. This class parses the input file, creates appropriate stages, and runs them in their own subdirectories.

The StageJob class represents a pipeline job linked to a specific stage.

The IO (input/output object) classes (defined in pipeio.py) represent information that is passed from one stage to another, such as a list of files. They are also called Variables.

Input Object Syntax

The Pipeline input file is used to specify which stages to run, how to run them (parameters), what to use for input, and where to send the output. An example input file looks like:

   SET MY_INPUT
       VARCLASS Structures
       FILE /home/adzhigir/vsw_testing/20confs.mae

The SET line value (MY_INPUT) specifies the name of the IO object. The VARCLASS value (Structures) specifies the PipeIO class to create. Pipeline searches the schrodinger.pipeline.pipeio module for the class name specified on this line. If the class is not found there, Pipeline assumes that a custom class is specified as an absolute path. (In this case, make sure the custom module is in your PYTHONPATH.)

All lines following VARCLASS define what information to put into this variable; in this case, it is a Maestro file (20confs.mae).

Stage Syntax

An example stage definition looks like:

   STAGE MY_STAGE
       STAGECLASS  macromodel.ConfSearchStage
       INPUT       MY_INPUT
       OUTPUT      MY_OUTPUT
       FFLD        MMFFS

The STAGE line value (MY_STAGE) specifies the name of the stage. The STAGECLASS keyword specifies the <module>.<class name> that defines the stage; Pipeline uses it to determine which stage to create. Pipeline also searches the schrodinger.pipeline.stages namespace for the module. Make sure the module is in your PYTHONPATH.

See schrodinger.pipeline.stages.combine for an example of how to write a stage module.

Input variables for the stage are specified via INPUT keywords, and outputs via OUTPUT keywords. The rest of the keywords tell the stage how to run.
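To make the block structure of the SET and STAGE examples above concrete, here is a minimal sketch of how such text could be split into named blocks. It is not the parser that Pipeline uses: the function name parse_blocks_sketch and the dictionary layout are hypothetical, and the sketch assumes that every non-blank line is a keyword followed by an optional value and that SET and STAGE blocks may appear in the same text.

```python
def parse_blocks_sketch(text):
    """Split SET/STAGE text into blocks (hypothetical helper, not Pipeline's parser)."""
    blocks = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate blocks visually
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if keyword in ("SET", "STAGE"):
            # A SET or STAGE line opens a new block and names it.
            current = {"kind": keyword, "name": value, "keywords": []}
            blocks.append(current)
        elif current is not None:
            # Keep keywords as ordered pairs, since a keyword such as
            # INPUT could be given more than once (an assumption).
            current["keywords"].append((keyword, value))
    return blocks


sample = """\
SET MY_INPUT
    VARCLASS Structures
    FILE /home/adzhigir/vsw_testing/20confs.mae

STAGE MY_STAGE
    STAGECLASS  macromodel.ConfSearchStage
    INPUT       MY_INPUT
    OUTPUT      MY_OUTPUT
    FFLD        MMFFS
"""

blocks = parse_blocks_sketch(sample)
for block in blocks:
    print(block["kind"], block["name"], block["keywords"])
```

The sketch only groups keywords; resolving VARCLASS and STAGECLASS to actual classes is left to Pipeline.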

If you wish to run the Pipeline without using the pipeline startup machinery:

   from schrodinger.pipeline import pipeline

   p = pipeline.Pipeline([options])
   p.readFile(<input file>)
   try:
       p.run()
   except RuntimeError:
       ...

If restartability is important, specify the restart_file when constructing the Pipeline object.

To restart Pipeline, do:

   p = pipeline.Restart(restart_file [, new options])
   try:
       p.run()
   except RuntimeError:
       ...

where restart_file is the same file that you specified to the Pipeline constructor when the initial instance was created.

Classes

StageJob
A "Job" that is used by Pipeline to run a Stage.

Pipeline
A controller responsible for running the stages in the correct order.

Functions

log(text)
Print the specified text to the log pipe, followed by a newline.

logn(text)
Print the specified text to the log pipe with no newline.

add_host_lists(list1, list2)
Append hosts in list2 to list1.

subtract_host_lists(list1, dict2)
Return available (not used) hosts.

_host_is_queue(hostname)
Return True if hostname is a queue.

_get_job(jobid)
Return the job object for job with jobid.

importName(modulename, name)
Import a named object from a module in the context of this function.

Restart(restart_file, restartbeg=False)
Recover a saved Pipeline instance.

Variables

__doc__ = ...

DEBUG = False

WAITING = 'WAITING'

RUNNING = 'RUNNING'

COMPLETED = 'COMPLETED'

FAILED = 'FAILED'

RESTARTING = 'RESTARTING'

updated_from_dump = {}

old_stage_classes = {'chargefilter.ChargeFilterStage': 'filter...

global_logfh = sys.stdout

_job_objects = {}

__package__ = 'schrodinger.pipeline'

Function Details

logn(text)

Print the specified text to the log pipe with no newline. This is especially useful when printing progress periods.

add_host_lists(list1, list2)

Append hosts in list2 to list1.

Example:

 list1 = a:5,b:10
 list2 = a:2,c:12
 output = a:7,b:10,c:12

The order of hosts is retained (first list is given priority).
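The merge described above can be sketched as follows. This is an illustration only, not the module's implementation: the name add_host_lists_sketch is hypothetical, and the sketch assumes that each host list is a Python list of (hostname, cpu_count) tuples, which the documentation does not confirm.

```python
def add_host_lists_sketch(list1, list2):
    """Merge two host lists, summing CPU counts for repeated hosts.

    Hypothetical helper; assumes each list holds (hostname, cpu_count) tuples.
    """
    totals = {}
    # Walk list1 first so that its hosts keep their positions;
    # dicts preserve insertion order in Python 3.7 and later.
    for host, cpus in list(list1) + list(list2):
        totals[host] = totals.get(host, 0) + cpus
    return list(totals.items())


merged = add_host_lists_sketch([("a", 5), ("b", 10)], [("a", 2), ("c", 12)])
print(merged)  # [('a', 7), ('b', 10), ('c', 12)]
```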

subtract_host_lists(list1, dict2)

Return available (not used) hosts. This function subtracts the host dict dict2 from the host dict list1.

Parameters:

list1 (dict) - All available hosts (specified by the user), with the hostname as key and the CPU count as value.
dict2 (dict) - All used hosts (used by stages).
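A minimal sketch of this subtraction, using plain dicts keyed by hostname, is shown below. The name subtract_host_lists_sketch is hypothetical, and dropping hosts that have no free CPUs left is an assumption; the documentation does not say how the real function treats them.

```python
def subtract_host_lists_sketch(available, used):
    """Return the CPUs still free on each host (hypothetical helper).

    Both arguments map hostname -> CPU count.
    """
    remaining = {}
    for host, cpus in available.items():
        free = cpus - used.get(host, 0)
        if free > 0:
            # Assumption: fully used hosts are left out of the result.
            remaining[host] = free
    return remaining


free_hosts = subtract_host_lists_sketch({"a": 7, "b": 10, "c": 12}, {"a": 2, "b": 10})
print(free_hosts)  # {'a': 5, 'c': 12}
```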

importName(modulename, name)

Import a named object from a module in the context of this function.

For example, if you would like to create an instance of the Foo class from the bar.py module:

   foo_class = importName("bar", "Foo")
   foo_instance = foo_class()

Raises:

ImportError - Raised when the object cannot be imported.
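For readers who want to see the general technique, a comparable helper can be written with the standard importlib module. This is a sketch under assumptions, not the source of importName: the name import_name_sketch is hypothetical, and converting a missing attribute into ImportError is a choice made here to match the documented exception.

```python
import importlib


def import_name_sketch(modulename, name):
    """Import modulename and return its attribute called name (hypothetical helper)."""
    # import_module raises ImportError itself if the module is missing.
    module = importlib.import_module(modulename)
    try:
        return getattr(module, name)
    except AttributeError as err:
        # Report a missing attribute with the same exception type.
        raise ImportError(
            "cannot import %r from %r" % (name, modulename)
        ) from err


# Usage with a standard-library class standing in for a stage class.
ordered_dict_class = import_name_sketch("collections", "OrderedDict")
instance = ordered_dict_class()
print(type(instance).__name__)  # OrderedDict
```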

Restart(restart_file, restartbeg=False)

Recover a saved Pipeline instance.

Specify new options only if the settings need to change.

Returns a Pipeline instance recovered from the restart_file. Call run() on the returned instance to start the pipeline.

Parameters:

restartbeg - Whether to restart failed stages from the beginning.

Raises:

RuntimeError - Raised if a Pipeline can't be loaded from the specified file.

Variables Details

__doc__

Value:

"""
Classes for running pipelines.

The main class is called Pipeline. This class parses the input file, creates
appropriate stages, and runs them in their own subdirectories.

The StageJob class represents a pipeline job linked to a specific stag
...

old_stage_classes

Value:

{'chargefilter.ChargeFilterStage': 'filtering.ChargeFilterStage',
 'gencodes.GenCodesStage': 'gencodes.Recombine',
 'ligfilter.LigFilterStage': 'filtering.LigFilterStage',
 'merge.MergeStage': 'glide.MergeStage',
 'mmgbsa.MMGBSAStage': 'prime.MMGBSAStage',
 'mopac.MopacStage': 'semiemp.SemiEmpStage',
 'phase.DBConfSites': 'phase.DBConfSitesStage',
 'phase.DBManage': 'phase.DBManageStage'}