schrodinger.application.phase.packages.mmp3d_driver_utils module

Provides argument parsing, job setup/cleanup, and other miscellaneous functionality for phase_mmp3d_driver.py.

Copyright Schrodinger LLC, All Rights Reserved.

class schrodinger.application.phase.packages.mmp3d_driver_utils.SubjobType

Bases: enum.Enum

An enumeration.

align_pairs = 2
smiles_to_3d = 1
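As an illustration only, the two members listed above can be mirrored with a local stand-in enum. The class below is a sketch and not the Schrodinger class itself; only the member names and values are taken from this page.

```python
import enum


class SubjobType(enum.Enum):
    """Local stand-in mirroring the documented members (not the real class)."""
    smiles_to_3d = 1
    align_pairs = 2


# Members can be looked up by value or by name.
by_value = SubjobType(2)
by_name = SubjobType["smiles_to_3d"]
```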
schrodinger.application.phase.packages.mmp3d_driver_utils.add_transformations(mmp2d_path, maefile)

Adds MMP transformation dictionaries to each of the aligned pairs in maefile and overwrites the file with the updated structures. The transformations (see mmp2d.get_transformations) are stored in the property MMP2D_TRANSFORMATIONS as a base64-encoded string which holds a JSON-encoded representation of the underlying Python data structure. Use decode_transformations to extract the data.

Parameters:
  • mmp2d_path (str) – Path to MMP 2D database
  • maefile (str) – Maestro file with pairs of alignments. Overwritten.
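The storage scheme described above (JSON, then base64) can be sketched with the standard library alone. The helper names and the dictionary keys below are hypothetical; the real keys come from TRANS_KEYS in the mmp2d module, and the real decoding should go through decode_transformations.

```python
import base64
import json


def encode_payload(data):
    """Serialize to JSON, then base64-encode into an ASCII string."""
    raw = json.dumps(data).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_payload(text):
    """Reverse of encode_payload: base64-decode, then parse the JSON."""
    return json.loads(base64.b64decode(text).decode("utf-8"))


# Hypothetical transformation records, used only to show the round trip.
example = [{"from_smiles": "[*:1]C", "to_smiles": "[*:1]N", "count": 3}]
encoded = encode_payload(example)
decoded = decode_payload(encoded)
```

Encoding to base64 keeps the JSON text safe to store as a single-line string property, at the cost of making it unreadable without decoding.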
schrodinger.application.phase.packages.mmp3d_driver_utils.align_pairs(project_path, id_pairs_file, out_mae, verbose=False)

Aligns pairs of multi-conformer ligands and writes the alignments to a Maestro file.

Parameters:
  • project_path (str) – Phase project containing the ligands
  • id_pairs_file (str) – CSV file with sorted pairs of ligand IDs
  • out_mae (str) – Output Maestro file for alignments
  • verbose (bool) – If true, a one-line summary will be printed for each pair
schrodinger.application.phase.packages.mmp3d_driver_utils.combine_alignments(args, nsub)

Combines pairwise alignments from subjobs.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • nsub (int) – Number of subjobs
schrodinger.application.phase.packages.mmp3d_driver_utils.convert_smiles_to_3d(in_smiles, out_mae, verbose=False)

Creates 3D structures from SMILES.

Parameters:
  • in_smiles (str) – Input CSV file with id, public_id, smiles, and prop_value
  • out_mae (str) – Output Maestro file for 3D structures
  • verbose (bool) – If true, each input row from in_smiles will be printed
schrodinger.application.phase.packages.mmp3d_driver_utils.create_phase_project(project_path, maefiles)

Creates a multi-conformer Phase project from the supplied Maestro files.

Parameters:
  • project_path (str) – Path to project to be created
  • maefiles (list(str)) – List of Maestro files with one conformer per compound
schrodinger.application.phase.packages.mmp3d_driver_utils.decode_transformations(st)

Decodes the MMP transformations in the provided structure and returns them as a list of dictionaries. Each dictionary contains the following key, value pairs, which describe a single transformation linking the provided structure to its associated MMP:

Key                      Value
-----------------------  -------------------------------------------------
TRANS_KEYS.FROM_SMILES   MMP fragment SMIRKS for the first compound (str)
TRANS_KEYS.TO_SMILES     MMP fragment SMIRKS for the second compound (str)
TRANS_KEYS.MIN           The min statistic for the transformation (float)
TRANS_KEYS.MAX           The max statistic for the transformation (float)
TRANS_KEYS.AVG           The avg statistic for the transformation (float)
TRANS_KEYS.STD           The std statistic for the transformation (float)
TRANS_KEYS.COUNT         The count statistic for the transformation (int)

Note that TRANS_KEYS is defined in the mmp2d module.

Parameters:st (structure.Structure) – Structure containing the property MMP2D_TRANSFORMATIONS
Returns:List of transformation dictionaries
Return type:list[dict{str: str/str/float/float/float/float/int}]
schrodinger.application.phase.packages.mmp3d_driver_utils.get_compound_id_pairs(maefile)

Given a Maestro file with pairs of aligned structures, this function reads pairs of compound ids from MMP3D_ID_PROP and returns the pairs in a set.

Parameters:maefile (str) – Maestro file with pairs of alignments
Returns:Set containing pairs of compound ids
Return type:set((int, int))
schrodinger.application.phase.packages.mmp3d_driver_utils.get_parser()

Creates argparse.ArgumentParser with supported command line options.

Returns:Argument parser object
Return type:argparse.ArgumentParser
schrodinger.application.phase.packages.mmp3d_driver_utils.get_ligand_id_pairs(project_path, mmp_id_pairs)

Returns pairs of ligand IDs in the supplied Phase project that correspond to the provided pairs of compound IDs from the MMP 2D database. A pair will be skipped if either of the compounds in the pair failed to be imported into the project due to size or other characteristics that are unacceptable to phase_database.

Parameters:
  • project_path (str) – Path to Phase project
  • mmp_id_pairs (list((int, int))) – Pairs of compound IDs from MMP 2D database
Returns:Pairs of ligand IDs
Return type:list((int, int))
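The skipping rule described for get_ligand_id_pairs can be sketched as follows. This is a minimal illustration, assuming a hypothetical dictionary that maps compound IDs to ligand IDs and simply lacks entries for compounds that failed import; how the real function queries the Phase project is not shown on this page.

```python
def map_id_pairs(compound_to_ligand, mmp_id_pairs):
    """Translate compound ID pairs to ligand ID pairs, skipping incomplete pairs.

    compound_to_ligand is a hypothetical dict from compound ID to ligand ID;
    compounds that failed import are simply absent from it.
    """
    ligand_pairs = []
    for first, second in mmp_id_pairs:
        if first in compound_to_ligand and second in compound_to_ligand:
            ligand_pairs.append(
                (compound_to_ligand[first], compound_to_ligand[second]))
    return ligand_pairs


# Compound 30 is missing from the lookup, so the pair (10, 30) is dropped.
lookup = {10: 1, 20: 2, 40: 3}
pairs = map_id_pairs(lookup, [(10, 20), (10, 30), (20, 40)])
```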

schrodinger.application.phase.packages.mmp3d_driver_utils.get_num_subjobs(args, total_inputs, subjob_type)

Returns the number of subjobs to run, taking into account the requested number of CPUs and the minimum allowed inputs per subjob.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line options
  • total_inputs (int) – Total number of inputs to be distributed over subjobs
  • subjob_type (SubjobType) – Subjob type
Returns:Number of subjobs
Return type:int
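One plausible way to balance the requested CPUs against a minimum number of inputs per subjob is sketched below. The function name, its parameters, and the formula are assumptions made for illustration; the actual thresholds and logic used by get_num_subjobs are not documented on this page.

```python
def estimate_num_subjobs(requested_cpus, total_inputs, min_inputs_per_subjob):
    """Cap subjobs at the CPU count and at what the per-subjob minimum allows.

    Hypothetical formula: never more subjobs than CPUs, and never so many
    that a subjob would receive fewer than min_inputs_per_subjob inputs
    (except when a single subjob is all that is possible).
    """
    if total_inputs <= 0:
        return 0
    max_by_inputs = max(1, total_inputs // min_inputs_per_subjob)
    return max(1, min(requested_cpus, max_by_inputs))


plenty_of_cpus = estimate_num_subjobs(8, 100, 25)
few_cpus = estimate_num_subjobs(2, 100, 25)
small_input = estimate_num_subjobs(8, 10, 25)
```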

schrodinger.application.phase.packages.mmp3d_driver_utils.get_parent_jobname(args)

Returns parent job name of the current subjob.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:Parent job name
Return type:str
schrodinger.application.phase.packages.mmp3d_driver_utils.setup_distributed_align_pairs(args)

Does setup for distributed alignment of activity cliff pairs.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:list of subjob commands
Return type:list(list(str))
schrodinger.application.phase.packages.mmp3d_driver_utils.setup_distributed_smiles_to_3d(args)

Does setup for a distributed conversion of SMILES to 3D.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line options
Returns:list of subjob commands
Return type:list(list(str))
schrodinger.application.phase.packages.mmp3d_driver_utils.split_inputs(args, rows, subjob_type, nsub=None)

Divides rows of input data over subjob CSV files and returns the commands to run the subjobs.

Parameters:
  • args (argparse.Namespace) – argparse.Namespace with command line arguments
  • rows (list(tuple)) – Rows to split
  • subjob_type (SubjobType) – Subjob type
  • nsub (int) – Overrides automatic determination of number of subjobs
Returns:list of subjob commands
Return type:list(list(str))
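Dividing rows over subjob CSV files can be sketched with the csv module. The helper name, the file naming pattern, and the round-robin distribution are assumptions; the sketch also returns file paths rather than subjob commands, since the driver's command line flags are not listed on this page.

```python
import csv
import os
import tempfile


def split_rows(rows, nsub, out_dir, prefix="subjob"):
    """Write rows round-robin into nsub CSV files and return the file paths."""
    paths = []
    for i in range(nsub):
        path = os.path.join(out_dir, f"{prefix}_{i + 1}.csv")
        with open(path, "w", newline="") as handle:
            # Every nsub-th row, starting at offset i, goes to subjob i + 1.
            csv.writer(handle).writerows(rows[i::nsub])
        paths.append(path)
    return paths


rows = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
with tempfile.TemporaryDirectory() as tmp:
    files = split_rows(rows, 2, tmp)
    counts = []
    for path in files:
        with open(path, newline="") as handle:
            counts.append(sum(1 for _ in csv.reader(handle)))
```

Round-robin slicing keeps the subjob sizes within one row of each other without computing chunk boundaries.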

schrodinger.application.phase.packages.mmp3d_driver_utils.validate_args(args)

Checks the validity of command line arguments.

Parameters:args (argparse.Namespace) – argparse.Namespace with command line arguments
Returns:tuple of a validity flag and an error message, which is non-empty if the arguments are not valid
Return type:bool, str
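The (bool, str) return convention can be sketched as below. The option names infile and ncpu, the specific checks, and the empty message on success are all hypothetical; the real parser's options and validation rules are not listed on this page.

```python
import argparse
import os


def check_args(args):
    """Return (True, "") when args look valid, else (False, message)."""
    if args.ncpu < 1:
        return False, "Number of CPUs must be at least 1"
    if not os.path.isfile(args.infile):
        return False, f"Input file not found: {args.infile}"
    return True, ""


# Hypothetical options, built directly as a Namespace for illustration.
args = argparse.Namespace(infile="no_such_file.csv", ncpu=0)
ok, message = check_args(args)
```

A caller would typically test the flag first and report the message only when validation fails.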