schrodinger.application.phase.packages.phase_screen_driver_utils module
Module with functionality used by phase_screen_driver.py.
Copyright Schrodinger LLC, All Rights Reserved.
-
class
schrodinger.application.phase.packages.phase_screen_driver_utils.
SourceFormat
Bases:
enum.Enum
An enumeration.
-
database
= 2
-
file
= 1
-
project
= 3
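The members above can be mirrored with a plain enum.Enum for experimentation outside a Schrodinger installation. The class below is a stand-in that copies only the names and values documented here; the real SourceFormat may carry additional behavior.

```python
import enum


class SourceFormat(enum.Enum):
    """Stand-in mirroring the documented members; not the Schrodinger class."""
    file = 1
    database = 2
    project = 3


# Members can be looked up by value or by name.
assert SourceFormat(2) is SourceFormat.database
assert SourceFormat["project"].value == 3
```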
-
Adds options that the user doesn’t need to know about.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
add_jobcontrol_options
(parser) Adds job control options to the provided parser.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
add_database_options
(parser) Adds database screening options to the provided parser.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
add_matching_options
(parser) Adds matching options to the provided parser.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
add_reporting_options
(parser) Adds reporting options to the provided parser.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
add_scoring_options
(parser) Adds scoring/filtering options to the provided parser.
Parameters: parser (argparse.ArgumentParser) – Argument parser object.
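The add_*_options functions above share one shape: each receives the parser and registers its options on it. A minimal sketch of that pattern follows. The helper name add_example_options, the group title, and the -example_cutoff flag are invented for illustration and are not options of phase_screen_driver.py.

```python
import argparse


def add_example_options(parser):
    """Hypothetical helper following the add_*_options(parser) pattern."""
    group = parser.add_argument_group("Example options")
    # "-example_cutoff" is an invented flag used only for illustration.
    group.add_argument("-example_cutoff", type=float, default=1.0,
                       help="Illustrative numeric option.")


parser = argparse.ArgumentParser(prog="example_driver")
add_example_options(parser)
args = parser.parse_args(["-example_cutoff", "2.5"])
print(args.example_cutoff)
```

Keeping each family of options in its own function lets a driver assemble one parser from several small, independently testable pieces.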
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
combine_hit_files
(args, subjobs) Combines hit files for the supplied subjobs.
Parameters: - args (argparse.Namespace) – Command line arguments
- subjobs (list(str)) – Subjob names
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
distribute_hypos
(hypos, num_zip_files, jobname) Distributes the supplied hypotheses equally over the indicated number of zip files and returns the names of those zip files.
Parameters: - hypos (list(PhpHypoAdaptor)) – Hypotheses
- num_zip_files (int) – Number of zip files to create
- jobname (str) – Job name
Returns: Names of zip files
Return type: list(str)
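"Equally" can be read as group sizes that differ by at most one. The sketch below shows one way to compute such a partition. It is an assumption about the behavior, not the module's code; the function name distribute_evenly is hypothetical, and writing the zip files is omitted.

```python
def distribute_evenly(items, num_groups):
    """Split items into contiguous groups whose sizes differ by at most one.

    The number of groups is capped at len(items) so no group is empty
    (a single empty group is returned for an empty input).
    """
    num_groups = max(1, min(num_groups, len(items)))
    base, extra = divmod(len(items), num_groups)
    groups = []
    start = 0
    for i in range(num_groups):
        # The first `extra` groups each take one additional item.
        size = base + (1 if i < extra else 0)
        groups.append(items[start:start + size])
        start += size
    return groups


# Seven placeholder hypothesis names spread over three groups.
groups = distribute_evenly([f"hypo_{i}" for i in range(7)], 3)
print([len(g) for g in groups])
```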
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_common_args
(args) Returns a command containing arguments that are common to all subjobs.
Parameters: args (argparse.Namespace) – Command line options
Returns: Command with common arguments
Return type: list(str)
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_hypos
(hypo_file) Reads one or more hypotheses from a .phypo or .zip file.
Parameters: hypo_file – A .phypo or .zip file
Returns: list of one or more hypotheses
Return type: list(PhpHypoAdaptor)
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_min_sites
(hypo, user_match) Returns the minimum number of sites that must be matched in the supplied hypothesis. This may come from user_match or from the PHASE_MIN_SITES property in the hypothesis. If neither is specified, it will be the total number of sites in the hypothesis.
Parameters: - hypo (PhpHypoAdaptor) – pharmacophore hypothesis
- user_match – User-specified minimum number of sites or None
Returns: Minimum number of sites to match
Return type: int
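The fallback order described above can be sketched with plain values. The description does not say which source wins when both user_match and PHASE_MIN_SITES are set; the sketch assumes the user-specified value takes precedence. The function name and its parameters are hypothetical stand-ins for data that the real function reads from the hypothesis object.

```python
def min_sites_to_match(total_sites, user_match=None, hypo_min_sites=None):
    """Pick the minimum number of sites to match.

    Assumed precedence: user_match, then the value stored in the
    hypothesis (hypo_min_sites), then the total number of sites.
    """
    if user_match is not None:
        return user_match
    if hypo_min_sites is not None:
        return hypo_min_sites
    return total_sites


# With no override at all, every site must be matched.
print(min_sites_to_match(5))
```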
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_num_subjobs
(args) Returns the number of subjobs requested on the command line via the -NJOBS or -HOST option.
Parameters: args (argparse.Namespace) – Command line options
Returns: Number of subjobs
Return type: int
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_parser
() Creates an argparse.ArgumentParser with the supported command line options.
Returns: Argument parser object
Return type: argparse.ArgumentParser
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_source_files
(source) Returns the names of the files/databases/zipped projects to be screened, taking proper account of whether the current process is running under job control.
Parameters: source (str) – A legal source of structures to screen
Returns: Names of files/databases/zipped projects to screen
Return type: list(str)
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
get_source_format
(source) Returns the format of source as a SourceFormat object.
Parameters: source (str) – The name of a file, database or zipped project
Returns: The format of source
Return type: SourceFormat
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
prepend_hypos
(args) Prepends pharmacophore hypotheses to the hit file.
Parameters: args (argparse.Namespace) – Command line options
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
remove_output_files
(args) Removes output files that would be created in the launch directory by the parent job.
Parameters: args (argparse.Namespace) – Command line arguments
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
setup_db_screen
(args, db_paths) Does setup for a distributed database screen.
Parameters: - args (argparse.Namespace) – Command line options
- db_paths (list(str)) – Databases to screen
Returns: list of subjob commands
Return type: list(list(str))
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
setup_distributed_screen
(args) Does all the setup required to launch distributed subjobs. This includes splitting input files or database subsets, and creating the files <subjob>_inputs.list, which contain the names of the input files for each subjob. Returns a list of subjob commands that can be supplied directly to JobDJ.addJob. The number of commands may be larger than the number of CPUs requested if the -NJOBS option is used to divide the work over a larger number of work units. Conversely, the number of commands may be smaller than requested if the provided source(s) of structures cannot be subdivided as requested (e.g., 2 multi-conformer files cannot be split over more than 2 subjobs).
Parameters: args (argparse.Namespace) – Command line options
Returns: list of subjob commands
Return type: list(list(str))
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
setup_fixed_file_screen
(args, file_names) Does setup for a distributed file screen where multiple conformers per molecule are present and thus the files cannot be split. Note that the number of subjobs will not exceed the number of input files, and the load balancing may be less than optimal if the input files differ significantly in their numbers of molecules and/or conformers.
Parameters: - args (argparse.Namespace) – Command line options
- file_names (list(str)) – Files to screen with runtime paths
Returns: list of subjob commands
Return type: list(list(str))
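The cap on the number of subjobs can be illustrated by assigning whole files to subjobs. The sketch below uses a simple round-robin assignment, which is an assumption: the module's actual assignment strategy is not documented here. The function name assign_files and the file names are hypothetical.

```python
def assign_files(file_names, num_subjobs):
    """Assign whole files to subjobs without splitting any file.

    The effective number of subjobs is capped at the number of files,
    so no subjob is left without input.
    """
    num_subjobs = max(1, min(num_subjobs, len(file_names)))
    buckets = [[] for _ in range(num_subjobs)]
    for i, name in enumerate(file_names):
        # Round-robin: file i goes to subjob i modulo num_subjobs.
        buckets[i % num_subjobs].append(name)
    return buckets


# Two files requested over four subjobs yield only two subjobs.
print(assign_files(["confs_a.maegz", "confs_b.maegz"], 4))
```

Because files are assigned whole, a subjob that receives one very large file will finish later than the others, which is the load-balancing caveat noted above.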
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
setup_project_screen
(args, project_names) Does setup for a distributed screen of zipped projects. This workflow is used only by phase_find_common, where a project of actives and a project of decoys are screened against the top-n pharmacophore hypotheses found by the common pharmacophore algorithm. Because we can’t unzip a project and hope that its database lands on a cross-mounted disk, we can’t readily divide the record numbers of the project database over multiple subjobs, as we do for a standard database screen. The most practical approach is to divide the hypotheses equally over the subjobs and have each subjob screen its own local copies of the unzipped project databases.
Parameters: - args (argparse.Namespace) – Command line options
- project_names (list(str)) – Zipped projects to screen with runtime paths
Returns: list of subjob commands
Return type: list(list(str))
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
setup_split_file_screen
(args, file_names) Does setup for a distributed file screen with splitting of the input files so that each subjob receives a single file with approximately the same number of structures as the other subjobs.
Parameters: - args (argparse.Namespace) – Command line options
- file_names (list(str)) – Files to screen with runtime paths
Returns: list of subjob commands
Return type: list(list(str))
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
validate_args
(args) Checks the validity of command line options.
Parameters: args (argparse.Namespace) – Command line options
Returns: tuple of validity and error message if not valid
Return type: bool, str
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
validate_dbsites
(args) Checks the legality of the -dbsites option with respect to all databases and hypotheses. Should be called only after the job is running on the remote host.
Parameters: args (argparse.Namespace) – Command line options
Returns: tuple of validity and error message if not valid
Return type: bool, str
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
validate_hypo
(args) Checks the validity of the hypothesis or hypotheses.
Parameters: args (argparse.Namespace) – Command line options
Returns: tuple of validity and error message if not valid
Return type: bool, str
-
schrodinger.application.phase.packages.phase_screen_driver_utils.
validate_source
(args) Checks the validity of the source of structures to screen and the validity of the command line options with respect to the source type.
Parameters: args (argparse.Namespace) – Command line options
Returns: tuple of validity and error message if not valid
Return type: bool, str
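The validate_* functions above all return a validity flag paired with an error message, which makes them easy to chain. The sketch below shows one way a caller might run several such checks and stop at the first failure. The runner and both check functions are hypothetical stand-ins, and a plain dict replaces the argparse.Namespace to keep the example self-contained.

```python
def run_validators(args, validators):
    """Run validators in order and stop at the first failure.

    Each validator takes args and returns (is_valid, error_message).
    """
    for validate in validators:
        is_valid, message = validate(args)
        if not is_valid:
            return False, message
    return True, ""


# Hypothetical stand-ins for the module's validate_* functions.
def check_has_source(args):
    if not args.get("source"):
        return False, "No source of structures was specified."
    return True, ""


def check_has_hypo(args):
    if not args.get("hypo"):
        return False, "No hypothesis file was specified."
    return True, ""


ok, error = run_validators({"source": "ligands.maegz"},
                           [check_has_source, check_has_hypo])
if not ok:
    print(f"Invalid arguments: {error}")
```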