schrodinger.application.phase.packages.phase_screen_driver_utils module

Module with functionality used by phase_screen_driver.py.

Copyright Schrodinger LLC, All Rights Reserved.

class schrodinger.application.phase.packages.phase_screen_driver_utils.SourceFormat

Bases: enum.Enum

An enumeration.

database = 2
file = 1
project = 3
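As a minimal sketch, the enumeration can be mirrored in plain Python to show how members are looked up by value or by name. The class below is a stand-in defined only for illustration; real code would import SourceFormat from this module instead.

```python
import enum


class SourceFormat(enum.Enum):
    """Stand-in that mirrors the documented member values."""
    file = 1
    database = 2
    project = 3


# Members can be looked up by value or by name.
by_value = SourceFormat(2)
by_name = SourceFormat["project"]

# Enum members are singletons, so identity comparison is the usual idiom.
is_database = by_value is SourceFormat.database
```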
schrodinger.application.phase.packages.phase_screen_driver_utils.add_hidden_options(parser)

Adds options that the user doesn’t need to know about.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.add_jobcontrol_options(parser)

Adds job control options to the provided parser.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.add_database_options(parser)

Adds database screening options to the provided parser.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.add_matching_options(parser)

Adds matching options to the provided parser.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.add_reporting_options(parser)

Adds reporting options to the provided parser.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.add_scoring_options(parser)

Adds scoring/filtering options to the provided parser.

Parameters:parser (argparse.ArgumentParser) – Argument parser object.
schrodinger.application.phase.packages.phase_screen_driver_utils.combine_hit_files(args, subjobs)

Combines hit files for the supplied subjobs.

Parameters:
  • args (argparse.Namespace) – Command line arguments
  • subjobs (list(str)) – Subjob names
schrodinger.application.phase.packages.phase_screen_driver_utils.distribute_hypos(hypos, num_zip_files, jobname)

Distributes the supplied hypotheses equally over the indicated number of zip files and returns the names of those zip files.

Parameters:
  • hypos (list(PhpHypoAdaptor)) – Hypotheses
  • num_zip_files (int) – Number of zip files to create
  • jobname (str) – Job name
Returns:

Names of zip files

Return type:

list(str)
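The equal distribution described above can be sketched as a plain partitioning step. This is only an illustration under stated assumptions: partition_evenly is a hypothetical helper, it works on any list rather than on PhpHypoAdaptor objects, it does not write zip files, and the choice to give earlier bins the extra items is an assumption rather than the module's documented behavior.

```python
def partition_evenly(items, num_bins):
    """Hypothetical helper: split items into contiguous bins whose sizes
    differ by at most one. Never returns more bins than there are items."""
    if num_bins < 1:
        raise ValueError("num_bins must be at least 1")
    if not items:
        return []
    num_bins = min(num_bins, len(items))
    base, extra = divmod(len(items), num_bins)
    bins = []
    start = 0
    for i in range(num_bins):
        # The first `extra` bins each take one additional item.
        size = base + (1 if i < extra else 0)
        bins.append(items[start:start + size])
        start += size
    return bins
```

Each resulting bin would then correspond to one zip file of hypotheses.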

schrodinger.application.phase.packages.phase_screen_driver_utils.get_common_args(args)

Returns a command containing arguments that are common to all subjobs.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:Command with common arguments
Return type:list(str)
schrodinger.application.phase.packages.phase_screen_driver_utils.get_hypos(hypo_file)

Reads hypothesis or hypotheses from a .phypo or .zip file.

Parameters:hypo_file – A .phypo or .zip file
Returns:list of one or more hypotheses
Return type:list(PhpHypoAdaptor)
schrodinger.application.phase.packages.phase_screen_driver_utils.get_min_sites(hypo, user_match)

Returns the minimum number of sites that must be matched in the supplied hypothesis. This may come from user_match or from the PHASE_MIN_SITES property in the hypothesis. If neither is specified, it will be the total number of sites in the hypothesis.

Parameters:
  • hypo (PhpHypoAdaptor) – pharmacophore hypothesis
  • user_match – User-specified minimum number of sites or None
Returns:

Minimum number of sites to match

Return type:

int
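The fallback order described above can be sketched with plain integers. resolve_min_sites is a hypothetical stand-in that takes the values directly instead of a PhpHypoAdaptor, and the assumption that a user-specified value takes precedence over the PHASE_MIN_SITES property is not confirmed by this documentation.

```python
def resolve_min_sites(total_sites, user_match=None, hypo_min_sites=None):
    """Hypothetical sketch of the fallback chain.

    total_sites    -- number of sites in the hypothesis
    user_match     -- user-specified minimum, or None
    hypo_min_sites -- value of the PHASE_MIN_SITES property, or None
    """
    if user_match is not None:
        # Assumption: the user's value overrides the stored property.
        return user_match
    if hypo_min_sites is not None:
        return hypo_min_sites
    # Neither was specified, so every site must be matched.
    return total_sites
```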

schrodinger.application.phase.packages.phase_screen_driver_utils.get_num_subjobs(args)

Returns the number of subjobs requested on the command line via the -NJOBS or -HOST option.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:Number of subjobs
Return type:int
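One way such a count could be derived is sketched below. count_subjobs is a hypothetical helper, not this module's implementation; it assumes that an explicit -NJOBS value wins over -HOST, that a host specification looks like "name:ncpus" with multiple entries separated by whitespace, and that an entry without a CPU count contributes one subjob.

```python
def count_subjobs(njobs=None, host_spec=None):
    """Hypothetical sketch: derive a subjob count from -NJOBS or -HOST."""
    if njobs is not None:
        # Assumption: an explicit -NJOBS value takes precedence.
        return njobs
    if not host_spec:
        return 1
    total = 0
    for entry in host_spec.split():
        # "name:4" yields cpus == "4"; a bare "name" yields cpus == "".
        _, _, cpus = entry.partition(":")
        total += int(cpus) if cpus else 1
    return total
```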
schrodinger.application.phase.packages.phase_screen_driver_utils.get_parser()

Creates argparse.ArgumentParser with supported command line options.

Returns:Argument parser object
Return type:argparse.ArgumentParser
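The pattern of building one parser from several add_*_options helpers can be sketched with the standard argparse module. Everything named demo below is hypothetical: the option names -demo_match and -demo_subjob are placeholders rather than real phase_screen options, and hiding an option with argparse.SUPPRESS is only one plausible way to implement options that the user does not need to know about.

```python
import argparse


def add_demo_matching_options(parser):
    """Adds an illustrative option group to the provided parser."""
    group = parser.add_argument_group("Matching options (illustrative)")
    group.add_argument("-demo_match", type=int, metavar="<n>",
                       help="Placeholder for a minimum-sites style option.")


def add_demo_hidden_options(parser):
    """Adds an option that is parsed but omitted from the help text."""
    parser.add_argument("-demo_subjob", action="store_true",
                        help=argparse.SUPPRESS)


def get_demo_parser():
    """Creates a parser and lets each helper register its own options."""
    parser = argparse.ArgumentParser(prog="demo_screen")
    add_demo_matching_options(parser)
    add_demo_hidden_options(parser)
    return parser


demo_args = get_demo_parser().parse_args(["-demo_match", "4", "-demo_subjob"])
```

Keeping each group of options in its own function lets a driver and its subjobs share the same definitions.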
schrodinger.application.phase.packages.phase_screen_driver_utils.get_source_files(source)

Returns the names of the files/databases/zipped projects to be screened, taking proper account of whether the current process is running under job control.

Parameters:source (str) – A legal source of structures to screen
Returns:Names of files/databases/zipped projects to screen
Return type:list(str)
schrodinger.application.phase.packages.phase_screen_driver_utils.get_source_format(source)

Returns the format of source as a SourceFormat object.

Parameters:source (str) – The name of a file, database or zipped project

Returns:The format of source
Return type:SourceFormat
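A classification of this kind could be sketched as a suffix check. This is not the module's logic: guess_source_format is a hypothetical helper, the SourceFormat class is a local stand-in, and the suffixes ".phdb" for databases and ".prjzip" for zipped projects are assumptions made for illustration.

```python
import enum
import os


class SourceFormat(enum.Enum):
    """Stand-in mirroring the documented enumeration."""
    file = 1
    database = 2
    project = 3


# Assumed suffixes, used here for illustration only.
DATABASE_SUFFIX = ".phdb"
PROJECT_SUFFIX = ".prjzip"


def guess_source_format(source):
    """Hypothetical helper: classify a source name by its suffix."""
    suffix = os.path.splitext(source.rstrip("/"))[1].lower()
    if suffix == DATABASE_SUFFIX:
        return SourceFormat.database
    if suffix == PROJECT_SUFFIX:
        return SourceFormat.project
    # Anything else is treated as an ordinary structure file.
    return SourceFormat.file
```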
schrodinger.application.phase.packages.phase_screen_driver_utils.prepend_hypos(args)

Prepends pharmacophore hypotheses to the hit file.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
schrodinger.application.phase.packages.phase_screen_driver_utils.remove_output_files(args)

Removes output files that would be created in the launch directory by the parent job.

Parameters:args (argparse.Namespace) – Command line arguments
schrodinger.application.phase.packages.phase_screen_driver_utils.setup_db_screen(args, db_paths)

Does setup for a distributed database screen.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • db_paths (list(str)) – Databases to screen
Returns:

list of subjob commands

Return type:

list(list(str))

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_distributed_screen(args)

Does all the setup required to launch distributed subjobs. This includes splitting input files or database subsets, and creation of the files <subjob>_inputs.list, which contain the names of the input files for each subjob. Returns a list of subjob commands that can be supplied directly to JobDJ.addJob. The number of commands may be larger than the number of CPUs requested if the -NJOBS option is used to divide the work over a larger number of work units. Conversely, the number of commands may be smaller than requested if the provided source(s) of structures cannot be subdivided as requested (e.g., 2 multi-conformer files cannot be split over more than 2 subjobs).

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:list of subjob commands
Return type:list(list(str))
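The <subjob>_inputs.list files mentioned above can be illustrated with a small sketch. write_input_lists is a hypothetical helper, the subjob names are placeholders, and the one-name-per-line layout is an assumption; only the <subjob>_inputs.list naming pattern comes from the description.

```python
import os
import tempfile


def write_input_lists(subjob_inputs, directory):
    """Hypothetical sketch: write one <subjob>_inputs.list per subjob.

    subjob_inputs maps a subjob name to the input file names it screens.
    Returns the paths of the list files in insertion order.
    """
    paths = []
    for subjob, inputs in subjob_inputs.items():
        path = os.path.join(directory, f"{subjob}_inputs.list")
        with open(path, "w") as handle:
            # Assumption: one input file name per line.
            handle.write("\n".join(inputs) + "\n")
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as workdir:
    written = write_input_lists(
        {"demo_sub_1": ["a.maegz", "b.maegz"], "demo_sub_2": ["c.maegz"]},
        workdir)
    list_names = [os.path.basename(p) for p in written]
    with open(written[0]) as handle:
        first_list = handle.read().splitlines()
```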
schrodinger.application.phase.packages.phase_screen_driver_utils.setup_fixed_file_screen(args, file_names)

Does setup for a distributed file screen where multiple conformers per molecule are present and thus the files cannot be split. Note that the maximum number of subjobs will not exceed the number of input files, and the load balancing may be less than optimal if the input files differ significantly in their numbers of molecules and/or conformers.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • file_names (list(str)) – Files to screen with runtime paths
Returns:

list of subjob commands

Return type:

list(list(str))
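The constraint that files cannot be split can be sketched as follows. assign_whole_files is a hypothetical helper and round-robin assignment is an assumed strategy; the sketch only demonstrates why the number of subjobs is capped by the number of files and why the load can be uneven.

```python
def assign_whole_files(file_names, num_subjobs):
    """Hypothetical sketch: hand out whole files to subjobs round-robin."""
    # A file cannot be split, so there can be no more subjobs than files.
    num_subjobs = max(1, min(num_subjobs, len(file_names)))
    assignments = [[] for _ in range(num_subjobs)]
    for index, name in enumerate(file_names):
        assignments[index % num_subjobs].append(name)
    return assignments
```

With two files and four requested subjobs, only two subjobs are created. With three files and two subjobs, the first subjob receives two files regardless of how many molecules each file holds.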

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_project_screen(args, project_names)

Does setup for a distributed screen of zipped projects. This workflow is used only by phase_find_common, where a project of actives and a project of decoys are screened against the top-n pharmacophore hypotheses found by the common pharmacophore algorithm. Because we can’t unzip a project and hope that its database lands on a cross-mounted disk, we can’t readily divide the record numbers of the project database over multiple subjobs, as we do for a standard database screen. The most practical approach is to divide the hypotheses equally over the subjobs and have each subjob screen its own local copies of the unzipped project databases.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • project_names (list(str)) – Zipped projects to screen with runtime paths
Returns:

list of subjob commands

Return type:

list(list(str))

schrodinger.application.phase.packages.phase_screen_driver_utils.setup_split_file_screen(args, file_names)

Does setup for a distributed file screen with splitting of the input files so that each subjob receives a single file with approximately the same number of structures as the other subjobs.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • file_names (list(str)) – Files to screen with runtime paths
Returns:

list of subjob commands

Return type:

list(list(str))
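The balancing goal described above can be sketched as a range computation. structure_ranges is a hypothetical helper that only computes which structures each subjob would receive; it assumes 1-based inclusive indices and does not read or write any structure files.

```python
def structure_ranges(total_structures, num_subjobs):
    """Hypothetical sketch: 1-based inclusive (first, last) index ranges
    giving each subjob approximately the same number of structures."""
    if total_structures < 1 or num_subjobs < 1:
        raise ValueError("Both arguments must be at least 1")
    # No subjob should end up with an empty range.
    num_subjobs = min(num_subjobs, total_structures)
    base, extra = divmod(total_structures, num_subjobs)
    ranges = []
    first = 1
    for i in range(num_subjobs):
        size = base + (1 if i < extra else 0)
        ranges.append((first, first + size - 1))
        first += size
    return ranges
```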

schrodinger.application.phase.packages.phase_screen_driver_utils.validate_args(args)

Checks the validity of command line options.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:tuple of validity and error message if not valid
Return type:bool, str
schrodinger.application.phase.packages.phase_screen_driver_utils.validate_dbsites(args)

Checks the legality of the -dbsites option with respect to all databases and hypotheses. Should be called only after the job is running on the remote host.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:tuple of validity and error message if invalid
Return type:bool, str
schrodinger.application.phase.packages.phase_screen_driver_utils.validate_hypo(args)

Checks the validity of the hypothesis or hypotheses.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:tuple of validity and error message if not valid
Return type:bool, str
schrodinger.application.phase.packages.phase_screen_driver_utils.validate_source(args)

Checks the validity of the source of structures to screen and the validity of the command line options w.r.t. the source type.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:tuple of validity and error message if not valid
Return type:bool, str
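The validate_* functions above all return a (bool, str) pair, which makes them easy to chain. The sketch below shows one way a caller might do so; run_validators and the two demo validators are hypothetical, as are the attribute names source and hypo on the namespace.

```python
import argparse


def demo_validate_source(args):
    """Hypothetical validator following the (bool, str) convention."""
    if not args.source:
        return False, "No source of structures was specified"
    return True, ""


def demo_validate_hypo(args):
    """Hypothetical validator: accept only .phypo or .zip names."""
    if not args.hypo.endswith((".phypo", ".zip")):
        return False, "Hypothesis file must be a .phypo or .zip file"
    return True, ""


def run_validators(args, validators):
    """Runs validators in order and stops at the first failure."""
    for validator in validators:
        ok, message = validator(args)
        if not ok:
            return False, message
    return True, ""


good = argparse.Namespace(source="ligands.maegz", hypo="model.phypo")
bad = argparse.Namespace(source="ligands.maegz", hypo="model.txt")
checks = [demo_validate_source, demo_validate_hypo]
good_result = run_validators(good, checks)
bad_result = run_validators(bad, checks)
```

Stopping at the first failure keeps the reported message tied to a single, specific problem.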