schrodinger.livedesign.preprocessor module

class schrodinger.livedesign.preprocessor.ExplicitHydrogens(value)[source]

Bases: enum.Enum

An enumeration.

REMOVE_ALL = 1
KEEP_WEDGED = 2
ADD_ALL = 3
AS_IS = 4
class schrodinger.livedesign.preprocessor.GenerateCoordinates(value)[source]

Bases: enum.Enum

An enumeration.

NONE = 1
FULL = 2
FULL_ALIGNED = 3
class schrodinger.livedesign.preprocessor.DoubleBondStereoStandard(value)[source]

Bases: enum.Enum

An enumeration.

NONE = 1
CROSSED = 2
WIGGLY = 3
class schrodinger.livedesign.preprocessor.RingRepresentation(value)[source]

Bases: enum.Enum

An enumeration.

KEKULE = 1
AROMATIC = 2
class schrodinger.livedesign.preprocessor.RemoveSGroupData(value)[source]

Bases: enum.Enum

An enumeration.

NONE = 1
ALL = 2
DAT_ONLY = 3
class schrodinger.livedesign.preprocessor.ChiralFlagTreatment(value)[source]

Bases: enum.Enum

An enumeration.

IGNORE = 1
CLEAN = 2
FORCE_ON = 3
FORCE_OFF = 4
class schrodinger.livedesign.preprocessor.PreprocessorOptions(KEEP_ONLY_LARGEST_STRUCTURE: bool = True, REMOVE_PROPERTIES: bool = False, STRIP_SALTS: Tuple[str] = ('[Cl, Br, I]', '[Li, Na, K, Ca, Mg]', '[O, N]', '[N](=O)(O)O', '[P](=O)(O)(O)O', '[P](F)(F)(F)(F)(F)F', '[S](=O)(=O)(O)O', '[CH3][S](=O)(=O)(O)', 'c1cc([CH3])ccc1[S](=O)(=O)(O)', '[CH3]C(=O)O', 'FC(F)(F)C(=O)O', 'OC(=O)C=CC(=O)O', 'OC(=O)C(=O)O', 'OC(=O)C(O)C(O)C(=O)O', 'C1CCCCC1[NH]C1CCCCC1'), CLEAN_WEDGE_ORIENTATION: bool = True, CHOOSE_CANONICAL_TAUTOMER: bool = False, TRANSFORMATIONS: Tuple[str] = ('[#8:2]=[#7:1]=[#8:3]>>[#8-:2]-[#7+:1]=[#8:3]', '[#6:3]-[#7H2:1]=[#8:2]>>[#6:3]-[#7H2+:1]-[#8-:2]', '[#6:3][P-:1]([#6:4])([#6:5])[#6+:2]>>[#6:3][P-0:1]([#6:4])([#6:5])=[#6+0:2]', '[#6:3][S;X3+0:1]([#6:4])=[#8-0:2]>>[#6:3][S+:1]([#6:4])-[#8-:2]', '[#6:3][P+:1]([#8;X2:4])([#8;X2:5])[#8-:2]>>[#6:3][P+0:1]([#8:4])([#8:5])=[#8-0:2]', '[#6:3][S+:1]([#6:4])([#8-:2])=[O:5]>>[#6:3][S+0:1]([#6:4])(=[#8-0:2])=[O:5]', '[#7;A;X2-:1][N;X2+:2]#[N;X1:3]>>[#7-0:1]=[N+:2]=[#7-:3]', '[#6;X3-:1][N;X2+:2]#[N;X1:3]>>[#6-0;A:1]=[N+:2]=[#7-:3]'), NEUTRALIZE: bool = True, EXPLICIT_HYDROGENS: schrodinger.livedesign.preprocessor.ExplicitHydrogens = <ExplicitHydrogens.REMOVE_ALL: 1>, GENERATE_COORDINATES: schrodinger.livedesign.preprocessor.GenerateCoordinates = <GenerateCoordinates.FULL_ALIGNED: 3>, DOUBLE_BOND_STEREO_STANDARD: schrodinger.livedesign.preprocessor.DoubleBondStereoStandard = <DoubleBondStereoStandard.NONE: 1>, CHIRAL_FLAG_TREATMENT: schrodinger.livedesign.preprocessor.ChiralFlagTreatment = <ChiralFlagTreatment.CLEAN: 2>, HEAVY_HYDROGEN_DT: bool = False, RING_REPRESENTATION: schrodinger.livedesign.preprocessor.RingRepresentation = <RingRepresentation.KEKULE: 1>, GENERATE_V3K_SDF: bool = True, REMOVE_SGROUP_DATA: schrodinger.livedesign.preprocessor.RemoveSGroupData = <RemoveSGroupData.NONE: 1>, CLEAR_INVALID_WEDGE_BONDS: bool = True, STRIP_STEREO_ABSOLUTE_GROUP: bool = True, STRIP_AND_GROUPS_ON_SINGLE_ATOM: bool = True)[source]

Bases: tuple

Options dictating preprocessor actions; all options default to no-ops unless otherwise specified.

KEEP_ONLY_LARGEST_STRUCTURE: bool

Alias for field number 0

REMOVE_PROPERTIES: bool

Alias for field number 1

STRIP_SALTS: Tuple[str]

Alias for field number 2

CLEAN_WEDGE_ORIENTATION: bool

Alias for field number 3

CHOOSE_CANONICAL_TAUTOMER: bool

Alias for field number 4

TRANSFORMATIONS: Tuple[str]

Alias for field number 5

NEUTRALIZE: bool

Alias for field number 6

EXPLICIT_HYDROGENS: schrodinger.livedesign.preprocessor.ExplicitHydrogens

Alias for field number 7

GENERATE_COORDINATES: schrodinger.livedesign.preprocessor.GenerateCoordinates

Alias for field number 8

DOUBLE_BOND_STEREO_STANDARD: schrodinger.livedesign.preprocessor.DoubleBondStereoStandard

Alias for field number 9

CHIRAL_FLAG_TREATMENT: schrodinger.livedesign.preprocessor.ChiralFlagTreatment

Alias for field number 10

HEAVY_HYDROGEN_DT: bool

Alias for field number 11

RING_REPRESENTATION: schrodinger.livedesign.preprocessor.RingRepresentation

Alias for field number 12

GENERATE_V3K_SDF: bool

Alias for field number 13

REMOVE_SGROUP_DATA: schrodinger.livedesign.preprocessor.RemoveSGroupData

Alias for field number 14

CLEAR_INVALID_WEDGE_BONDS: bool

Alias for field number 15

STRIP_STEREO_ABSOLUTE_GROUP: bool

Alias for field number 16

STRIP_AND_GROUPS_ON_SINGLE_ATOM: bool

Alias for field number 17

static fromConfig(config: dict)[source]
Parameters

config – configuration from which to build options

Raises
  • KeyError – if an unknown key is present

  • ValueError – if an unknown value is present

toConfig() → dict[source]
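
The documented error behavior of fromConfig can be illustrated with a small stand-in. This is only a sketch: MiniOptions and Hydrogens are hypothetical two-field substitutes for PreprocessorOptions and ExplicitHydrogens, and the assumption that enum options are given by member name in the config is not confirmed by this page.

```python
from enum import Enum
from typing import NamedTuple


class Hydrogens(Enum):
    """Hypothetical stand-in for ExplicitHydrogens."""
    REMOVE_ALL = 1
    KEEP_WEDGED = 2


class MiniOptions(NamedTuple):
    """Hypothetical two-field stand-in for PreprocessorOptions."""
    NEUTRALIZE: bool = True
    EXPLICIT_HYDROGENS: Hydrogens = Hydrogens.REMOVE_ALL

    @staticmethod
    def from_config(config: dict) -> "MiniOptions":
        values = {}
        for key, raw in config.items():
            if key not in MiniOptions._fields:
                raise KeyError(f"unknown option: {key}")
            default = MiniOptions._field_defaults[key]
            if isinstance(default, Enum):
                # Assumption: enum options are spelled by member name.
                try:
                    values[key] = type(default)[raw]
                except KeyError:
                    raise ValueError(f"unknown value for {key}: {raw}") from None
            else:
                values[key] = raw
        # Options missing from the config keep their defaults.
        return MiniOptions(**values)

    def to_config(self) -> dict:
        # Enum members are written back out by name; other values as-is.
        return {
            key: value.name if isinstance(value, Enum) else value
            for key, value in self._asdict().items()
        }
```

Because the options class is a tuple subclass, a built instance is immutable; in this sketch a changed copy would be obtained with the standard `_replace` method of named tuples.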
__contains__(key, /)

Return key in self.

__len__()

Return len(self).

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

schrodinger.livedesign.preprocessor.atom_has_non_standard_query(atom) → bool[source]

Checks whether the atom has a non-standard query feature, such as M, that RDKit does not treat as a query.

Parameters

atom – the atom to check

Returns

whether or not the atom has a non-standard query feature

schrodinger.livedesign.preprocessor.s_group_has_non_standard_query(s_group) → bool[source]

Checks whether the s-group has a non-standard query feature, such as M, that RDKit does not treat as a query.

Parameters

s_group – the s-group to check

Returns

whether or not the s-group has a non-standard query feature

schrodinger.livedesign.preprocessor.is_queryatom_exception(atom)[source]

Normally we raise an exception if query atoms are in the molecule to be preprocessed. This function returns True for query atoms which are allowed.

Parameters

atom – the atom to check

Returns

whether or not the atom is allowed in the preprocessor

schrodinger.livedesign.preprocessor.coords_all_zero(conf)[source]

Returns whether or not all atom positions in a conformer are zero
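
The check can be sketched without RDKit by treating a conformer as a plain sequence of (x, y, z) tuples. The function name below is hypothetical, and whether the real function compares exactly or with a tolerance is not stated here; exact comparison is assumed.

```python
from typing import Iterable, Tuple


def positions_all_zero(positions: Iterable[Tuple[float, float, float]]) -> bool:
    """Return True when every coordinate of every position is exactly zero."""
    # Assumption: exact comparison against 0.0, no tolerance.
    return all(coord == 0.0 for pos in positions for coord in pos)
```

Note that in this sketch an empty sequence of positions also yields True.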

schrodinger.livedesign.preprocessor.setup_mol(mol)[source]

Setup that is always performed on a molecule, regardless of configuration.

Parameters

mol – An unsanitized RDKit Mol

Returns

A partially sanitized RDKit mol, ready for the standardizer.

schrodinger.livedesign.preprocessor.do_final_corrections(molblock, generated_coordinates, v3000)[source]
schrodinger.livedesign.preprocessor.preprocess_molblock(molblock: str, config: Optional[dict] = None) → str[source]

Process molecule based on config objects

NOTE: transforms are done before neutralization and tautomer canonicalization

schrodinger.livedesign.preprocessor.preprocess(mol: rdkit.Chem.rdchem.Mol, options: Optional[schrodinger.livedesign.preprocessor.PreprocessorOptions] = None) → str[source]
schrodinger.livedesign.preprocessor.handle_pentavalent_nitrogens(mol)[source]

Pentavalent nitrogens are usually the incorrect representation and will break most of the downstream functions. The pentavalent form is allowed for nitro groups (-NO2), but other instances are transformed to the charge-separated form.

schrodinger.livedesign.preprocessor.handle_aromatic_heteroatoms_with_attachment_points(mol)[source]

Aromatic atoms with attachment points are impossible to kekulize. This is resolved here by making it explicit that they have an implicit H attached (this was SS-31328).

schrodinger.livedesign.preprocessor.add_explicit_hydrogens(mol)[source]
schrodinger.livedesign.preprocessor.remove_explicit_hydrogens(mol, keep_wedged=False)[source]
schrodinger.livedesign.preprocessor.convert_to_molblock(mol, v3000, kekulize)[source]
schrodinger.livedesign.preprocessor.convert_heavy_hydrogens(molblock, v3000)[source]

NOTE that this operates on a molblock, not a molecule

The RDKit does not currently (v2020.03) support writing D or T to mol blocks, so we need to post-process the text. Fortunately it’s an easy regex in V3000 mol blocks. This does not work with V2000 mol blocks, so we throw a ValueError there. This doesn’t seem like a big deal since V2000 support is primarily being kept around for debugging purposes. If we need to eventually support V2000+HEAVY_HYDROGEN_DT, some not-completely-trivial code will need to be written.
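
A rough sketch of this kind of text post-processing follows. The actual regex used by the module is not shown on this page; the pattern below assumes that heavy hydrogens appear in V3000 atom lines as symbol H with a MASS=2 or MASS=3 attribute, and that the attribute is dropped once the symbol becomes D or T. The function name is hypothetical.

```python
import re

# Assumption: a V3000 atom line looks like
#   "M  V30 <index> <symbol> <x> <y> <z> <aamap> [KEY=value ...]"
# and a hydrogen isotope is written as symbol H with MASS=2 or MASS=3.
_HEAVY_H = re.compile(r"^(M  V30 \d+ )H( .*?) MASS=([23])\b", re.MULTILINE)
_SYMBOLS = {"2": "D", "3": "T"}


def heavy_hydrogens_to_dt(molblock: str) -> str:
    """Rewrite H atoms carrying MASS=2/3 as D/T in a V3000 mol block."""
    if "V3000" not in molblock:
        # Mirrors the documented behavior: V2000 input is rejected.
        raise ValueError("only V3000 mol blocks are supported")
    return _HEAVY_H.sub(
        lambda m: m.group(1) + _SYMBOLS[m.group(3)] + m.group(2), molblock
    )
```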

schrodinger.livedesign.preprocessor.neutralize(mol, checkForProblematicHs=False)[source]
schrodinger.livedesign.preprocessor.unicode_to_str(unicode_str)[source]

Takes a unicode object and converts it to a str (utf-8). If the arg is already a str, returns unicode_str (i.e. if run with python 3). Needed to support python 2/3 with unicode_literals.

py2: type<unicode> -> type<str utf-8>
py3: type<str utf-8> (no unicode type exists)

Parameters

unicode_str (unicode (py2) or str (py3)) – the unicode that potentially needs converting (i.e. if run with python 2)

Returns

str

schrodinger.livedesign.preprocessor.transform(mol, transformation, maxTransformations=1000)[source]

Applies the transformation to the molecule repeatedly until it no longer applies.

The maxTransformations argument only exists to prevent an infinite loop caused by a bogus transformation.
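
The repeat-until-stable loop with a safety cap can be sketched generically. This is not the module's implementation: apply_until_stable and fix_one_nitro are hypothetical names, and a plain string replacement stands in for a real reaction-based transformation. What the real function does when the cap is reached is not documented here; this sketch simply returns the last state.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def apply_until_stable(
    state: T, step: Callable[[T], Optional[T]], max_steps: int = 1000
) -> T:
    """Apply `step` until it returns None (no longer applies) or the cap is hit."""
    for _ in range(max_steps):
        result = step(state)
        if result is None:
            break
        state = result
    return state


def fix_one_nitro(smiles: str) -> Optional[str]:
    """Toy stand-in for a transformation: rewrite one pentavalent nitro group."""
    if "N(=O)=O" not in smiles:
        return None
    return smiles.replace("N(=O)=O", "[N+](=O)[O-]", 1)
```

With this sketch, `apply_until_stable("c1cc(N(=O)=O)ccc1N(=O)=O", fix_one_nitro)` rewrites both nitro groups and then stops, while a step that always applies is cut off after max_steps iterations.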

schrodinger.livedesign.preprocessor.in_xy_plane(mol)[source]
schrodinger.livedesign.preprocessor.generate_coordinates(mol, align=False)[source]
schrodinger.livedesign.preprocessor.generate_canonical_tautomer(mol)[source]
schrodinger.livedesign.preprocessor.clear_wedge_bonds_from_achiral_centers(mol)[source]
schrodinger.livedesign.preprocessor.chiral_flag_treatment(adj_mode, src_mol, mol)[source]
schrodinger.livedesign.preprocessor.strip_stereo_abs(input_mol)[source]

Removes any Stereo ABS group

Parameters

input_mol – The original molecule to consider

Returns

post-processed molecule, if the input molecule was modified

schrodinger.livedesign.preprocessor.strip_stereo_and(input_mol)[source]

Removes any Stereo AND groups with only one center and flattens the bonds around it

Parameters

input_mol – The original molecule to consider

Returns

post-processed molecule, if the input molecule was modified

schrodinger.livedesign.preprocessor.frag_is_smaller(atoms, largest_atoms, weight, largest_weight, smiles, largest_smiles)[source]

A fragment is considered larger if its atom count or weight is larger, if its SMILES string is longer, or, when the SMILES strings are of equal length, if its SMILES string is lexicographically smaller. For example, ‘AAA’ is larger than ‘AAB’, hence the final smiles > largest_smiles check, which rejects the candidate.
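
The comparison can be sketched as follows, assuming the criteria are applied in order as successive tie-breakers (atom count, then weight, then SMILES length, then lexicographic order). The function name is hypothetical and the real implementation may combine the checks differently.

```python
def candidate_is_smaller(
    atoms, largest_atoms, weight, largest_weight, smiles, largest_smiles
) -> bool:
    """Return True if the candidate fragment ranks below the current largest."""
    # Assumption: each criterion only matters when the previous ones tie.
    if atoms != largest_atoms:
        return atoms < largest_atoms
    if weight != largest_weight:
        return weight < largest_weight
    if len(smiles) != len(largest_smiles):
        return len(smiles) < len(largest_smiles)
    # Equal length: the lexicographically later string is the smaller fragment.
    return smiles > largest_smiles
```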

schrodinger.livedesign.preprocessor.connect_variable_attachment_points(mol)[source]

Forms zero-order bonds between one of the atoms of a bond with an ATTACH property and the “main” molecule, so that the molecule and its variable attachment point are treated as a single fragment.

Returns a 2-tuple with:
  1. the modified molecule

  2. whether or not the molecule was modified

schrodinger.livedesign.preprocessor.remove_fragments(mol)[source]

Fragments are not removed if the molecule contains any SGroups which are associated with polymers

Use the following criteria to remove unwanted fragments from mol:
  1. keep only the fragment with the greatest number of atoms

  2. break ties by keeping only fragments with the greatest molecular weight

  3. break ties with the longest SMILES string

  4. break additional ties by keeping the fragment with the earliest alphabetically sorted SMILES string

If two or more identical fragments remain after 1-4, we will throw a fatal error.
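
The selection rules above can be sketched with a sort key over precomputed fragment data. Fragment and pick_fragment are hypothetical names, the polymer SGroup exception is not modeled, and ValueError is only an assumed stand-in for the fatal error, whose actual type is not given here.

```python
from typing import NamedTuple, Sequence


class Fragment(NamedTuple):
    """Hypothetical precomputed summary of one fragment."""
    atoms: int
    weight: float
    smiles: str


def pick_fragment(fragments: Sequence[Fragment]) -> Fragment:
    """Pick the fragment to keep using criteria 1-4, in order."""
    if not fragments:
        raise ValueError("no fragments to choose from")

    def rank(frag: Fragment):
        # Most atoms, then heaviest, then longest SMILES, then earliest SMILES.
        return (-frag.atoms, -frag.weight, -len(frag.smiles), frag.smiles)

    ordered = sorted(fragments, key=rank)
    if len(ordered) > 1 and rank(ordered[0]) == rank(ordered[1]):
        # Assumption: the documented fatal error is modeled as ValueError.
        raise ValueError("identical fragments remain after tie-breaking")
    return ordered[0]
```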

schrodinger.livedesign.preprocessor.clear_brackets_from_sgroups(mol)[source]

Removes brackets from any s-groups

schrodinger.livedesign.preprocessor.remove_properties(mol)[source]
schrodinger.livedesign.preprocessor.remove_sgroups(mol, which)[source]
schrodinger.livedesign.preprocessor.strip_salts(mol, salt_list)[source]
schrodinger.livedesign.preprocessor.add_chiral_hs(mol)[source]
schrodinger.livedesign.preprocessor.wedge_clean(mol)[source]
schrodinger.livedesign.preprocessor.reapply_molblock_wedging(mol)[source]
schrodinger.livedesign.preprocessor.remove_wiggly_bonds_around_double_bonds(mol)[source]
schrodinger.livedesign.preprocessor.set_double_bond_stereo(mol, mol_block, bond_type)[source]
schrodinger.livedesign.preprocessor.main(argv=None)[source]

Function to run preprocessor directly from the command line.